Abstract: This paper proposes a model of interactions between two point
processes, governed by a reproduction function $h$, which is regarded as the
intensity of a Poisson process. We focus in particular on neuroscience,
where the aim is to detect possible interactions in the cerebral activity
associated with two neurons. To give a mathematical answer to this question
raised by neurobiologists, we address the problem of testing the nullity of the
intensity $h$. We construct a multiple testing procedure obtained by
aggregating single tests based on a wavelet thresholding method. This test has
good theoretical properties: its level is guaranteed, its power is controlled
under some assumptions, and its uniform separation rate over weak
Besov bodies is adaptive minimax. Simulations are then provided,
showing the good practical behavior of our testing procedure.

1 Introduction

In neurosciences, an important issue lies in a better understanding of the dynamics of cerebral activity 
in the cortex. In practice it is possible to measure, in vivo and for a specific task, the cerebral 
activity through the emission of action potentials by several neurons, and the specific interest of the 
neurobiologists is to understand how these action potentials appear. During a task, the recording of 
all arrival times of these action potentials (or spikes) on a neuron forms a spike train. From this point 
of view, the spike train can be modeled by a point process. 

Several years ago it was thought that the activities of different neurons during a task were independent
(for example, see Barlow [3]); this explains why spike trains were usually modeled
by independent Poisson processes. Today, thanks to technological advances in the recording of brain
activity, various studies show that this belief is false (for instance, see Gerstein [13] and Lestienne [24]).
Recent studies therefore consider neuronal assemblies instead of separate neuronal activities. For
example, the activities of pairs of neurons that have been recorded simultaneously show that there exists a
phenomenon called synchronization (see Grammont and Riehle [14] and Grün et al. [16]): the presence
of a spike on one of the two spike trains can affect the occurrence of a spike, with a delay, on the
second spike train. From a biological point of view, such a phenomenon reflects a reality. Indeed, an
action potential appears if the neuron is sufficiently excited. To obtain a sufficient excitation,
two strategies exist: either the frequency of spikes received from a single neuron increases, or the
receiving neuron receives fewer spikes but at the same time from different neurons. This second strategy
is precisely the synchronization. From a biological point of view, it consumes less energy and
the reaction is faster. Therefore, neurobiologists are interested in detecting the synchronization
phenomenon. More generally, they want to detect whether or not neurons evolve independently of
each other, a dependence being a hint of a functional connection during a task.

To answer this question mathematically, we need a model that takes into account the possible
interactions between two neurons. In neurosciences, a possible model is the Hawkes process (for instance,
see [18] for theoretical aspects and [5, 23, 27, ?] for its introduction in neurosciences). However, the
Hawkes process is, theoretically speaking, a very complicated model. We therefore consider a simpler version
of the Hawkes process which is realistic for the possible applications (in neurosciences, in genomics, ...)
and for which it is possible to carry out computations. Our model is the following one. Let $N_p$ and
$N_c$ be two point processes with respective conditional intensities

$$\lambda_p : t \mapsto \mu_p \quad \text{and} \quad \lambda_c : t \mapsto \mu_c + \int_{-\infty}^{t} h(t-u)\, dN_p(u), \tag{1.1}$$

where $\mu_p$ and $\mu_c$ are positive parameters describing the spontaneous part (in the context of
neurosciences, the spontaneous occurrence of spikes) and $h$ is a function which reflects the influence of $N_p$
on $N_c$. In this model, we have to assume that $\mathrm{supp}(h) \subset \mathbb{R}_+^*$, where $\mathrm{supp}(h)$ is the support of the
function $h$. Moreover, $N_p$ is a homogeneous Poisson process (for instance, see [22]) and $N_c$ is a special
case of Hawkes process. The biological problem, which consists in knowing whether $N_p$ influences $N_c$,
is equivalent to testing the null hypothesis $\mathcal{H}_0$: "$h = 0$" against the alternative $\mathcal{H}_1$: "$h \neq 0$".

The above formulation of $\lambda_c$ is an integral form. However, conditionally on the points
of $N_p$, the model can be viewed in terms of descendants rather than in terms of an intensity conditional on
past observations only. Indeed, given a positive real number $T$ representing the recording time of
the neuronal activity and given an integer $n$, conditionally on the event "the number of points of $N_p$
lying in $[0;T]$ is $n$", the points of the process $N_p$ obey the same law as an $n$-sample of uniform random
variables on $[0;T]$, denoted $U_1, \dots, U_n$ and named parents. Thus, conditionally on $U_1, \dots, U_n$, we can
write $\lambda_c(t) = \mu_c + \sum_{i=1}^{n} h(t - U_i)$. This new expression of $\lambda_c$ can be interpreted as follows. Each $U_i$
gives birth independently to a Poisson process $N_c^i$ with intensity the function $t \mapsto h(t - U_i)$ with
respect to the Lebesgue measure on $\mathbb{R}$, to which is added a homogeneous Poisson process $N_c^0$ with
constant intensity $\mu_c$, representing the orphans. We consequently consider the aggregated process

$$N_c = \sum_{i=0}^{n} N_c^i, \quad \text{whose intensity is given by the function} \quad t \mapsto \mu_c + \sum_{i=1}^{n} h(t - U_i), \tag{1.2}$$

and the points of the process $N_c$ are named children. With this interpretation, the goal of the present
paper is to test the "influence or not" of the parents on the children, via the reproduction function $h$.
This second formulation has several benefits. First, the assumption $\mathrm{supp}(h) \subset \mathbb{R}_+^*$ is no longer mandatory.
Compared with the first formulation, this may look like a minor difference, but in practice the
impact is considerable. Indeed, in the context of neurosciences, assuming that the support
of $h$ is in $\mathbb{R}_+$ means that one favors a direction of interaction, namely $N_p$ affects $N_c$. However, in practice,
we do not have this information a priori. Therefore, when the test does not reject $\mathcal{H}_0$, it means that $N_p$
does not seem to influence $N_c$, but this may be because in reality it is $N_c$ that affects $N_p$. We must
bear in mind that the initially proposed model is not symmetric in terms of neurons and that a support in
$\mathbb{R}_+$ does not really allow us to answer the question of dependence. Causality is indeed represented by
the fact that a child appears after its parent and therefore $h$ has to be supported in $\mathbb{R}_+$. Heuristically, a
consequence is the following interpretation: if a parent has a child before its own birth, this may indicate
that the child is in fact the parent and the parent the real child. Looking at both sides of the support makes
the procedure, in some sense, adaptive to the causality of parent/child roles. Another advantage of this
second formulation is that it allows applications to other disciplines such as genomics, where one studies
for instance the favored or avoided distances between patterns on a strand of DNA and where it is not
always possible to know which pattern rules the other. More details about this application to genomics
can be found in Sansonnet [33], where the author proposes an estimation procedure for the function $h$,
assumed to be well localized, based on wavelet thresholding methods, in a model very similar to the
one studied here. The interested reader will find other estimation procedures for the function $h$ in this
DNA context, using a Hawkes model, in Gusto and Schbath [17] and Reynaud-Bouret and Schbath
[30]. In this paper, we consider the model defined by (1.2). For the simulation study, the parent process
$(U_i)_i$ is simulated as a homogeneous Poisson process of intensity $\mu_p$.

Since the null hypothesis "$h = 0$" means that, conditionally on the total number of points
of $N_c$, the points of the process $N_c$ are i.i.d. (independent and identically distributed) with uniform
distribution, a first rather naive approach is to perform a Kolmogorov-Smirnov test (see for instance
[8]). But this test is not powerful, as illustrated in the section devoted to the simulations. The aim
of this paper is then to build a more powerful nonparametric test $\Phi_\alpha$ with values in $\{0, 1\}$ of $\mathcal{H}_0$:
"$h = 0$" against the alternative $\mathcal{H}_1$: "$h \neq 0$", rejecting $\mathcal{H}_0$ when $\Phi_\alpha = 1$, with prescribed probabilities
of first and second kind errors. The performance of the test $\Phi_\alpha$ is measured by its uniform separation
rate (see for instance [1]).

In neurosciences, parametric methods exist to detect such dependence. For instance, the Unitary
Event (UE) method (see [16]) and the Multiple Tests based on a Gaussian Approximation of the UE (MTGAUE)
method (see [35]) partially answer the problem by considering coincidences (see Section 4 for more
details). In the one-sample Poisson process model (that is to say $n = 1$ and $\mu_c = 0$ in our model),
many papers deal with different problems of testing the simple hypothesis that an observed point
process is a Poisson process with a known intensity. We can cite for instance the papers by Fazli
and Kutoyants [10], where the alternative is also a Poisson process with a known intensity, Fazli [9],
where the alternatives are Poisson processes with one-sided parametric intensities, and Dachian and
Kutoyants [7], where the alternatives are self-exciting point processes (namely, Hawkes processes). In
the nonparametric framework, Ingster and Kutoyants [20] propose a goodness-of-fit test where the
alternatives are Poisson processes with nonparametric intensities in a Sobolev or a Besov ball $\mathcal{B}^{\delta}_{2,q}(R)$
with $1 \le q < \infty$ and known smoothness parameter $\delta$. They establish its uniform separation rate over
a Sobolev or a Besov ball and show the adaptivity of their testing procedure in a minimax sense.

In some practical cases like the study of the expression of neuronal interactions or the study of
favored or avoided distances between patterns on a strand of DNA, such smooth alternatives (Sobolev
or Besov balls) cannot be considered. Indeed, the intensity of the Poisson process $N_c$ in these cases
may burst at a particular position of special interest for the neuroscientist or the biologist. So we
have to develop a testing procedure able to distinguish a constant function (or here a null function)
from a function that has some small localized spikes. These features are not well captured by
classical Besov spaces. Hence we focus in particular on alternatives based on sparsity rather than on
alternatives based on smoothness. For this, we are interested in the computation of uniform separation
rates over weak versions of Besov balls. Such alternatives have already been considered. For instance,
Fromont et al. [11] propose non-asymptotic and nonparametric tests of the homogeneity of a Poisson
process that are adaptive over various Besov bodies simultaneously and in particular over weak Besov
bodies. Another example is Fromont et al. [12], who construct non-asymptotic and nonparametric
multiple tests of the equality of the intensities of two independent Poisson processes, that are adaptive
in the minimax sense over a large variety of classes of alternatives based on classical and weak Besov
bodies in particular.

The test $\Phi_\alpha$ proposed in this paper consists in a multiple testing procedure obtained by aggregating
several single tests based on a wavelet thresholding method as in Fromont et al. [11, 12] (they also
consider model selection and kernel estimation methods). First, Proposition 2 proves that the multiple
test is an $\alpha$-level test and Theorem 2 gives a condition on the alternative ensuring that our multiple
test has a prescribed second kind error. This result reveals two regimes as in [34]. Indeed, our model
presents a double asymptotic through the number $n$ of parents and the recording time $T$ (namely, the
length of the observation interval), which is not usual. Since $N_p$ is a homogeneous Poisson process
with constant intensity $\mu_p$, the number $n$ of points of $N_p$ falling into $[0;T]$ is the realization of a Poisson
random variable with parameter $\mu_p T$. As a consequence, with very high probability, $T$ is proportional
to $n$ and in this case, the uniform separation rates of the multiple test over weak Besov bodies are
established by Theorem 3. Thus, our testing procedure is near adaptive in the minimax sense over a
class of such alternatives. The proofs of these results are essentially based on concentration inequalities
(see [26]) and on exponential inequalities for $U$-statistics (see [19]). Secondly, some simulations are
carried out to validate our procedure, which is compared with the classical Kolmogorov-Smirnov test
and with a testing procedure proposed by Tuleau-Malot et al. [35], which formalizes a well-known procedure
in neurosciences, namely the UE method (see Grün et al. [16]).

The paper is organized as follows. Section 2 deals with the description of our testing procedure.
Section 3 is devoted to the general results of the paper. The control of the probability of second kind
error is ensured by Theorem 1 for the single testing procedures and by Theorem 2 for the multiple test.
The uniform separation rates of the multiple test over weak Besov bodies are provided in Theorem 3.
Section 4 presents the simulation study. The proofs of our main theoretical results are deferred
to Section 6.

To end this section we introduce some notation that will be used throughout the paper. We denote by
$dN_p$ and $dN_c$ the point measures associated with $N_p$ and $N_c$ respectively. We denote by $\mathbb{P}_0$ the
distribution of the aggregated process $N_c$ under $\mathcal{H}_0$, by $\mathbb{P}_h$ the distribution of $N_c$ whose intensity conditionally
on $U_1, \dots, U_n$ is given by the function $t \mapsto \mu_c + \sum_{i=1}^{n} h(t - U_i)$ for any alternative $h$, and by $\mathbb{E}_h$ the
corresponding expectation. The uniform distribution on $[0;T]$ is named $\pi$ and $\mathbb{E}_\pi(f(U))$ denotes the
expectation of $f(U)$ where $U \sim \pi$ (an independent copy of $U_1, \dots, U_n$) for any measurable function $f$.
For an orthonormal basis $\{\varphi_\lambda, \lambda \in L\}$ of a finite dimensional subspace $S_L$ of $\mathbb{L}_2(\mathbb{R})$, we denote by $D_L$
the dimension of $S_L$ (namely the cardinal of $L$) and by $h_L$ the orthogonal projection of $h$ onto $S_L$.



2 Description of our testing procedure 

In the sequel, we assume that $h$ is compactly supported (there is a maximal time of synchronization
during a task according to the neuroscientists). Without loss of generality, we suppose now that the
support of $h$ is strictly included in $[-1;1]$ and that we observe the $U_i$'s (the parents) on $[0;T]$ and
realizations of the process $N_c$ (the children) on $[-1;T+1]$. Consequently, $h$ belongs to $\mathbb{L}_1(\mathbb{R})$, $\mathbb{L}_2(\mathbb{R})$ and
$\mathbb{L}_\infty(\mathbb{R})$. Then, we can consider the decomposition of $h$ on the Haar basis denoted by $\{\varphi_\lambda, \lambda \in \Lambda\}$:

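$$h = \sum_{\lambda \in \Lambda} \beta_\lambda \varphi_\lambda \quad \text{with} \quad \beta_\lambda = \int_{\mathbb{R}} h(x)\varphi_\lambda(x)\,dx,$$

where

$$\Lambda = \{\lambda = (j,k) : j \ge -1,\ k \in \mathbb{Z}\}$$

and for any $\lambda \in \Lambda$ and any $x \in \mathbb{R}$,

$$\varphi_\lambda(x) = \begin{cases} \phi(x - k) & \text{if } \lambda = (-1,k), \\ 2^{j/2}\,\psi(2^j x - k) & \text{if } \lambda = (j,k) \text{ with } j \ge 0, \end{cases}$$

with

$$\phi = \mathbf{1}_{[0;1]} \quad \text{and} \quad \psi = \mathbf{1}_{]1/2;1]} - \mathbf{1}_{[0;1/2]}.$$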

The functions $\phi$ and $\psi$ are respectively the father and the mother wavelets. Since the goal is to detect a
signal and not to reconstruct it, the Haar basis is suitable in our context. Furthermore, from a practical
point of view, the use of the Haar basis yields fast algorithms that are easy to implement. Nevertheless, the
theoretical results of the present paper can be generalized to a biorthogonal wavelet basis (see [6] for
a definition of this particular basis) as in [29, ?, ?]. We point out that our results extend easily
to a function $h$ compactly supported in $[-A;A]$ for any $A > 0$ by scaling the data by $\lceil A \rceil + 1$.
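The Haar functions $\varphi_\lambda$ and the coefficients $\beta_\lambda$ can be illustrated numerically. The following is only a minimal sketch: the function names (`haar`, `coefficient`), the midpoint-rule integration and the grid size are illustrative assumptions, not part of the paper.

```python
def phi(x):
    """Father wavelet: indicator of [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def psi(x):
    """Mother wavelet: -1 on [0, 1/2], +1 on ]1/2, 1], 0 elsewhere."""
    if 0.0 <= x <= 0.5:
        return -1.0
    return 1.0 if 0.5 < x <= 1.0 else 0.0


def haar(j, k, x):
    """Value at x of the basis function indexed by lambda = (j, k), j >= -1."""
    if j == -1:
        return phi(x - k)
    return 2.0 ** (j / 2.0) * psi(2.0 ** j * x - k)


def coefficient(h, j, k, lo=-1.0, hi=1.0, grid=20_000):
    """Midpoint-rule approximation of beta_lambda = int h(x) phi_lambda(x) dx.

    The integration window [lo, hi] and the grid size are hypothetical choices;
    [-1, 1] is enough because h is assumed to be supported there.
    """
    step = (hi - lo) / grid
    total = 0.0
    for i in range(grid):
        x = lo + (i + 0.5) * step
        total += h(x) * haar(j, k, x)
    return total * step
```

For instance, with the hypothetical choice `h = lambda x: 1.0 if 0.25 <= x <= 0.5 else 0.0`, the call `coefficient(h, 1, 0)` approximates $\sqrt{2}/4$, since $\varphi_{(1,0)}$ equals $\sqrt{2}$ on $]1/4;1/2]$.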

By considering this wavelet decomposition of $h$, the null hypothesis $\mathcal{H}_0$: "$h = 0$" means that all
the coefficients $\beta_\lambda$ are null and the alternative hypothesis $\mathcal{H}_1$: "$h \neq 0$" means that there exists at least
one non-zero coefficient. Since $h$ is strictly supported in $[-1;1]$, if one coefficient $\beta_{(-1,k)}$ is non-zero,
then there exists at least one coefficient $\beta_{(j',k')}$ with $j' \ge 0$ which is also non-zero. Therefore, we focus
only on the coefficients $\beta_{(j,k)}$ with $j \ge 0$ and we introduce the following subset $\Gamma$ of $\Lambda$:

$$\Gamma = \{\lambda = (j,k) \in \Lambda : j \ge 0,\ k \in \mathcal{K}_j\}$$

with $\mathcal{K}_j = \{k \in \mathbb{Z} : -2^j \le k \le 2^j - 1\}$ ($\mathcal{K}_j$ is the set of integers $k$ such that the intersection of the
support of $\varphi_\lambda$ and $[-1;1]$ is not empty, with $\lambda = (j,k)$).
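For every $\lambda$ in $\Gamma$, the coefficient $\beta_\lambda$ is estimated by

$$\hat\beta_\lambda = \frac{1}{n} \int_{\mathbb{R}} \sum_{i=1}^{n} \left[ \varphi_\lambda(x - U_i) - \frac{n-1}{n}\, \mathbb{E}_\pi\big(\varphi_\lambda(x - U)\big) \right] dN_c(x).$$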



These estimates, inspired by those proposed in [34] for a simpler model, namely with $\mu_c = 0$, are
unbiased:

Proposition 1. For all $\lambda = (j,k)$ in $\Gamma$, $\hat\beta_\lambda$ is an unbiased estimator of $\beta_\lambda$.

The proof of Proposition 1 uses the fact that for all $\lambda$ in $\Gamma$, $\int_{\mathbb{R}} \varphi_\lambda(t)\,dt = 0$, avoiding boundary
effects (see Section 6.1).
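To make the estimator concrete, here is a small sketch of how $\hat\beta_\lambda$ could be computed for the Haar basis. It assumes the children are given as a plain list of points, and it uses the closed form $\mathbb{E}_\pi(\varphi_\lambda(x-U)) = T^{-1}\int_{x-T}^{x}\varphi_\lambda(v)\,dv$, which holds because $U$ is uniform on $[0;T]$. All function names are hypothetical.

```python
def psi(x):
    """Haar mother wavelet: -1 on [0, 1/2], +1 on ]1/2, 1], 0 elsewhere."""
    if 0.0 <= x <= 0.5:
        return -1.0
    return 1.0 if 0.5 < x <= 1.0 else 0.0


def psi_primitive(y):
    """Integral of psi from -infinity to y."""
    if 0.0 <= y <= 0.5:
        return -y
    return y - 1.0 if 0.5 < y <= 1.0 else 0.0


def haar(j, k, x):
    """phi_lambda(x) for lambda = (j, k) with j >= 0."""
    return 2.0 ** (j / 2.0) * psi(2.0 ** j * x - k)


def mean_shifted(j, k, x, T):
    """E_pi[phi_lambda(x - U)] for U uniform on [0, T]."""
    def primitive(y):
        return 2.0 ** (-j / 2.0) * psi_primitive(2.0 ** j * y - k)
    return (primitive(x) - primitive(x - T)) / T


def beta_hat(j, k, parents, children, T):
    """Empirical coefficient: (1/n) * sum over children of the centred sum over parents."""
    n = len(parents)
    total = 0.0
    for x in children:
        own = sum(haar(j, k, x - u) for u in parents)
        # summing (n - 1) / n * E_pi over the n parents gives (n - 1) * E_pi
        total += own - (n - 1) * mean_shifted(j, k, x, T)
    return total / n
```

With two parents at 1.0 and 1.5, a single child at 1.3 and $T = 2$, `beta_hat(1, 0, [1.0, 1.5], [1.3], 2.0)` returns $\sqrt{2}/2$: only the first parent sees the child inside the support of $\varphi_{(1,0)}$ and the centering term vanishes away from the boundaries.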

In order to test the null hypothesis $\mathcal{H}_0$: "$h = 0$" against "$h \neq 0$", namely "$\exists \lambda \in \Gamma, \beta_\lambda \neq 0$",
we first propose to test, for all $\lambda \in \Gamma$, the null hypothesis $\mathcal{H}_0$ against the alternative $\mathcal{H}_1$: "$\beta_\lambda \neq 0$". For
each $\lambda \in \Gamma$, the associated single test actually consists in testing "$\beta_\lambda = 0$" against "$\beta_\lambda \neq 0$" or, more
precisely, in testing the absence of variation of the function $h$ on a small interval. In a second
step, we will aggregate these single tests to test the nullity of $h$ on its complete support.



2.1 The single testing procedures 

Let us fix some $\alpha \in\, ]0;1[$ and $\lambda \in \Gamma$. We want to construct an $\alpha$-level test of the null hypothesis $\mathcal{H}_0$:
"$h = 0$" against $\mathcal{H}_1$: "$\beta_\lambda \neq 0$", from the observation of the parents $U_1, \dots, U_n$ and the realization of
the Poisson process $N_c$. We notice first that the null hypothesis entails in particular that $\beta_\lambda = 0$.
We introduce the test statistic $\hat T_\lambda$ defined by

$$\hat T_\lambda = |\hat\beta_\lambda|. \tag{2.1}$$
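Our single test consists in rejecting the null hypothesis when $\hat T_\lambda$ is too large and, more precisely, when

$$\hat T_\lambda > q_\lambda^{[U_1,\dots,U_n;\,N_{c,tot}]}(\alpha),$$

where $N_{c,tot}$ is the (random) number of points of the process $N_c$ falling into $[-1;T+1]$ and, for any
$m \in \mathbb{N}^*$, $q_\lambda^{[U_1,\dots,U_n;\,m]}(\alpha)$ is the $(1-\alpha)$-quantile conditionally on $U_1, \dots, U_n$ of

$$\hat T^0_{\lambda,m} = \frac{1}{n} \left| \sum_{k=1}^{m} \sum_{i=1}^{n} \left[ \varphi_\lambda(V_k^0 - U_i) - \frac{n-1}{n}\, \mathbb{E}_\pi\big(\varphi_\lambda(V_k^0 - U)\big) \right] \right| \tag{2.2}$$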



with $(V_1^0, \dots, V_m^0)$ an $m$-sample with uniform distribution on $[-1;T+1]$ (namely an $m$-sample of the
process $N_c$ under $\mathcal{H}_0$). We can easily prove that conditionally on $U_1, \dots, U_n$ and $N_{c,tot} = m$, $\hat T_\lambda$ and
$\hat T^0_{\lambda,m}$ have exactly the same distribution under $\mathcal{H}_0$. Thus, the corresponding test function is defined by

$$\Phi_{\lambda,\alpha} = \mathbf{1}_{\hat T_\lambda > q_\lambda^{[U_1,\dots,U_n;\,N_{c,tot}]}(\alpha)}. \tag{2.3}$$
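The single test lends itself to a simple simulation-based implementation. The sketch below approximates the conditional quantile in (2.3) by Monte Carlo, drawing uniform samples on $[-1;T+1]$ as in (2.2); the use of an empirical quantile, the number of replications `B` and the seeded generator are assumptions made for illustration, not the authors' stated implementation.

```python
import math
import random


def psi(x):
    """Haar mother wavelet: -1 on [0, 1/2], +1 on ]1/2, 1], 0 elsewhere."""
    if 0.0 <= x <= 0.5:
        return -1.0
    return 1.0 if 0.5 < x <= 1.0 else 0.0


def psi_primitive(y):
    """Integral of psi from -infinity to y."""
    if 0.0 <= y <= 0.5:
        return -y
    return y - 1.0 if 0.5 < y <= 1.0 else 0.0


def statistic(j, k, parents, points, T):
    """|beta_hat_lambda| computed from `points` (observed children or null draws)."""
    n, scale = len(parents), 2.0 ** j
    total = 0.0
    for x in points:
        own = sum(psi(scale * (x - u) - k) for u in parents)
        # E_pi[psi(2^j (x - U) - k)] for U uniform on [0, T]
        mean = (psi_primitive(scale * x - k)
                - psi_primitive(scale * (x - T) - k)) / (scale * T)
        total += own - (n - 1) * mean
    return abs(total) * math.sqrt(scale) / n


def single_test(j, k, parents, children, T, alpha=0.05, B=500, rng=None):
    """Return (decision, observed statistic, Monte Carlo quantile)."""
    rng = rng or random.Random(0)
    m = len(children)
    observed = statistic(j, k, parents, children, T)
    # B draws of the statistic under H0: m i.i.d. uniform points on [-1, T + 1]
    null = sorted(
        statistic(j, k, parents, [rng.uniform(-1.0, T + 1.0) for _ in range(m)], T)
        for _ in range(B)
    )
    quantile = null[min(B - 1, math.ceil((1.0 - alpha) * B) - 1)]
    return int(observed > quantile), observed, quantile
```

On a toy data set where every parent has exactly one child 0.3 time units later, the statistic for $\lambda = (1,0)$ equals $\sqrt{2}$ and the test rejects; on real data, `B` would need to be large enough for the empirical quantile to be reliable.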
2.2 The multiple testing procedure



Previously, we have built testing procedures based on each single empirical coefficient $\hat\beta_\lambda$. We propose
in this subsection to consider a collection of empirical coefficients instead of a single one, and to define
a multiple testing procedure by aggregating the corresponding single tests.

Let $\{w_\lambda, \lambda \in \Gamma\}$ be a collection of positive numbers such that $\sum_{\lambda \in \Gamma} e^{-w_\lambda} \le 1$. This set allows us
to put weights on the empirical coefficients according to their index $\lambda = (j,k) \in \Gamma$. Given $\alpha \in\, ]0;1[$, we
consider the test which rejects $\mathcal{H}_0$ when there exists at least one $\lambda$ in $\Gamma$ such that

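$$\hat T_\lambda > q_\lambda^{[U_1,\dots,U_n;\,N_{c,tot}]}\big(u_\alpha^{[U_1,\dots,U_n;\,N_{c,tot}]} e^{-w_\lambda}\big),$$

where

$$u_\alpha^{[U_1,\dots,U_n;\,N_{c,tot}]} = \sup\left\{ u > 0 : \mathbb{P}\left( \max_{\lambda \in \Gamma} \Big( \hat T^0_{\lambda,N_{c,tot}} - q_\lambda^{[U_1,\dots,U_n;\,N_{c,tot}]}(u e^{-w_\lambda}) \Big) > 0 \,\Big|\, U_1,\dots,U_n; N_{c,tot} \right) \le \alpha \right\}. \tag{2.4}$$

The corresponding test function is defined by

$$\Phi_\alpha = \mathbf{1}_{\max_{\lambda \in \Gamma} \left( \hat T_\lambda - q_\lambda^{[U_1,\dots,U_n;\,N_{c,tot}]}\big(u_\alpha^{[U_1,\dots,U_n;\,N_{c,tot}]} e^{-w_\lambda}\big) \right) > 0}. \tag{2.5}$$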



We mention that, since the set $\Gamma$ is countably infinite, the number of tests to be performed is infinite,
which is not a problem from a theoretical point of view. But in practice, we have to perform a finite
number of single tests, so we fix a maximal resolution level $j_0$ and we carry out the single
tests $\Phi_{\lambda,\alpha}$ for $\lambda = (j,k)$ in $\Gamma$ with $j \le j_0$.

In the next section, we study the properties of the single tests $\Phi_{\lambda,\alpha}$ defined by (2.3) and the multiple
test $\Phi_\alpha$ defined by (2.5), through their probabilities of first and second kind errors.
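A simplified version of the aggregation can be sketched as follows. This is not the calibrated procedure of the paper: the sketch replaces $u_\alpha$ by its lower bound $\alpha$ (stated in Proposition 2), which gives a more conservative test, and it approximates each quantile by Monte Carlo. The weights are those of (3.4) in Section 3.3; the names `multiple_test`, `B` and `j0` are hypothetical.

```python
import math
import random


def psi(x):
    """Haar mother wavelet: -1 on [0, 1/2], +1 on ]1/2, 1], 0 elsewhere."""
    if 0.0 <= x <= 0.5:
        return -1.0
    return 1.0 if 0.5 < x <= 1.0 else 0.0


def psi_primitive(y):
    """Integral of psi from -infinity to y."""
    if 0.0 <= y <= 0.5:
        return -y
    return y - 1.0 if 0.5 < y <= 1.0 else 0.0


def statistic(j, k, parents, points, T):
    """|beta_hat_lambda| computed from `points` (observed children or null draws)."""
    n, scale = len(parents), 2.0 ** j
    total = 0.0
    for x in points:
        own = sum(psi(scale * (x - u) - k) for u in parents)
        mean = (psi_primitive(scale * x - k)
                - psi_primitive(scale * (x - T) - k)) / (scale * T)
        total += own - (n - 1) * mean
    return abs(total) * math.sqrt(scale) / n


def weight(j):
    """Weight w_lambda of (3.4), using |K_j| = 2^(j + 1)."""
    return 2.0 * (math.log(j + 1) + math.log(math.pi / math.sqrt(6.0))) + (j + 1) * math.log(2.0)


def multiple_test(parents, children, T, alpha=0.05, j0=2, B=1000, rng=None):
    """Conservative aggregated test: u_alpha is replaced by its lower bound alpha."""
    rng = rng or random.Random(0)
    m = len(children)
    # the same null samples are reused for every lambda
    null_sets = [[rng.uniform(-1.0, T + 1.0) for _ in range(m)] for _ in range(B)]
    rejected = []
    for j in range(j0 + 1):
        level = alpha * math.exp(-weight(j))
        for k in range(-2 ** j, 2 ** j):
            null = sorted(statistic(j, k, parents, pts, T) for pts in null_sets)
            # when level < 1 / B this index is B - 1, i.e. the sample maximum
            quantile = null[min(B - 1, math.ceil((1.0 - level) * B) - 1)]
            if statistic(j, k, parents, children, T) > quantile:
                rejected.append((j, k))
    return int(bool(rejected)), rejected
```

Because the corrected levels $\alpha e^{-w_\lambda}$ decrease quickly with $j$, `B` must grow with $j_0$ for the empirical quantiles to be meaningful; calibrating $u_\alpha$ as in (2.4) would require estimating the joint probability from the same null samples.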



3 Main theoretical results 

3.1 Probability of first kind error 

We constructed our single and multiple tests in such a way that the first kind error, which measures
the probability that the test wrongly rejects the null hypothesis, is less than $\alpha$.

Proposition 2. Let $\alpha$ be a fixed level in $]0;1[$. Then the single test $\Phi_{\lambda,\alpha}$ defined by (2.3) for any $\lambda \in \Gamma$
and the multiple test $\Phi_\alpha$ defined by (2.5) are of level $\alpha$. Furthermore, $u_\alpha^{[U_1,\dots,U_n;\,N_{c,tot}]}$ defined by (2.4)
satisfies $u_\alpha^{[U_1,\dots,U_n;\,N_{c,tot}]} \ge \alpha$.

This result shows that the tests are exactly of level $\alpha$, which is required for a test from a
non-asymptotic point of view (namely $n$ and $T$ are not required to tend to infinity).
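The lower bound on $u_\alpha$ in Proposition 2 can be understood through a union bound. The lines below are only a sketch of that argument, using the definition of the quantiles and the condition $\sum_{\lambda\in\Gamma} e^{-w_\lambda} \le 1$; they are not the proof given in Section 6, and the conditioning on $U_1,\dots,U_n$ and $N_{c,tot} = m$ is left implicit.

```latex
\begin{align*}
\mathbb{P}\Big(\max_{\lambda\in\Gamma}\big(\hat T^0_{\lambda,m}
   - q_\lambda^{[U_1,\dots,U_n;\,m]}(\alpha e^{-w_\lambda})\big) > 0\Big)
 &\le \sum_{\lambda\in\Gamma}
   \mathbb{P}\big(\hat T^0_{\lambda,m} > q_\lambda^{[U_1,\dots,U_n;\,m]}(\alpha e^{-w_\lambda})\big) \\
 &\le \sum_{\lambda\in\Gamma} \alpha e^{-w_\lambda} \;\le\; \alpha .
\end{align*}
```

Hence $u = \alpha$ belongs to the set whose supremum defines $u_\alpha$ in (2.4), which gives $u_\alpha \ge \alpha$.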

3.2 Probability of second kind error 

The second kind error, which measures the probability that the test wrongly fails to reject the null
hypothesis, is not fixed by the testing procedure, unlike the first kind error. We have to control the
probability of second kind error in such a way that it is close to 0, in order to obtain powerful tests.
The following theorem gives a condition which guarantees that the single tests have a prescribed
second kind error.

Since $h$ belongs to $\mathbb{L}_1(\mathbb{R})$ and $\mathbb{L}_\infty(\mathbb{R})$, we introduce two positive real numbers $R_1$ and $R_\infty$ such that
$\|h\|_1 \le R_1$ and $\|h\|_\infty \le R_\infty$.

Theorem 1. Let $\alpha$, $\beta$ be fixed levels in $]0;1[$. Let $C$ and $\kappa$ be positive constants depending on $\beta$, $\mu_c$,
$R_1$ and $R_\infty$. For all $\lambda \in \Gamma$, let $\Phi_{\lambda,\alpha}$ be the test function defined by (2.3). Assume that



$$|\beta_\lambda| > C \sqrt{\frac{1}{n} + \frac{1}{T} + \frac{2^{-j} n}{T^2}} + \kappa \,\big\{ \cdots \big\} \tag{3.1}$$

for $\lambda = (j,k)$. Then,

$$\mathbb{P}_h(\Phi_{\lambda,\alpha} = 0) \le \beta.$$



Note that the quantity $\frac{1}{n} + \frac{1}{T} + \frac{2^{-j} n}{T^2}$ that appears under the square root of the first term of the
right hand side of (3.1) is of the same order as the upper bound of the variance of the estimates $\hat\beta_\lambda$
(see Proposition 1 of [34]). Consequently, the right hand side of (3.1) can be viewed as a standard
deviation term, since the other terms are not asymptotically larger than the first term if we assume
that $2^j \le n^2$, where asymptotic means $n \to +\infty$ or $T \to +\infty$.

Theorem 1 means that if the coefficient $\beta_\lambda$ is far enough from 0, then the probability of second
kind error is controlled. This result gives a threshold for $\beta_\lambda$ from which our associated single testing
procedure is able to detect a signal and shows that its power is larger than $1 - \beta$. Furthermore, if we
consider the regime "$T$ proportional to $n$" in order to compare our result with known asymptotic rates
of testing, Condition (3.1) can be easily obtained, for instance if $\beta_\lambda^2 > C/n$, by assuming that $2^j \le n^2$,
with $C$ a positive constant.

Now we are interested in the power of the multiple testing procedure and the following theorem 
gives a condition on the alternative in order to ensure that our multiple test has a prescribed second 
kind error. 



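Theorem 2. Let $\alpha$, $\beta$ be fixed levels in $]0;1[$. Let $\Phi_\alpha$ be the test function defined by (2.5). Assume
that there exists at least one finite subset $L$ of $\Gamma$ such that

$$\|h_L\|_2^2 > \Big( C_1 D_L + C_2 \sum_{\lambda \in L} w_\lambda \Big) \big[ \cdots \big] + \Big( C_3 D_L + C_4 \sum_{\lambda \in L} w_\lambda + C_5 \sum_{\lambda \in L} w_\lambda^2 \Big) \big[ \cdots \big], \tag{3.2}$$

where $j_L = \max\{j : (j,k) \in L \text{ with } k \in \mathcal{K}_j\}$ and $C_1$, $C_2$, $C_3$, $C_4$ and $C_5$ are positive constants
depending on $\alpha$, $\beta$, $\mu_c$, $R_1$ and $R_\infty$. Then,

$$\mathbb{P}_h(\Phi_\alpha = 0) \le \beta.$$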

This theorem means that if there exists one subspace $S_L$ of $\mathbb{L}_2(\mathbb{R})$ such that $h_L$ (the orthogonal
projection of $h$ onto $S_L$) lies outside a small ball around 0, then the probability of second kind error is
controlled. This result gives a threshold for the energy of $h_L$ from which our multiple testing procedure
is able to detect a signal and shows that its power is larger than $1 - \beta$. Furthermore, if we consider the
regime "$T$ proportional to $n$" in order to compare our result with known asymptotic rates of testing,
Condition (3.2) can be easily obtained, for instance if $\|h_L\|_2^2 > C \times \big( D_L + \sum_{\lambda \in L} w_\lambda + \sum_{\lambda \in L} w_\lambda^2 \big) / n$,
by assuming that $2^{j_L} \le n^2$, with $C$ a positive constant. Then, the separation rate between the null
and the alternative hypotheses is of order $D_L/n$ and this is typical for testing procedures based on
a thresholding approach (see [11, 12] for instance). Usually, nested tests (namely based on model
selection) achieve a faster rate of separation of order $\sqrt{D_L}/n$ (see [1, 2] for instance). But these
latter tests are not adaptive over weak Besov bodies. Consequently, the separation rate established by
Theorem 2 leads to sharp upper bounds for the uniform separation rates over such particular classes
of alternatives and so, our multiple testing procedure will be proved to be adaptive over particular
classes of alternatives, based on weak Besov bodies.

3.3 Uniform separation rates

Given some $\alpha, \beta \in\, ]0;1[$, we have previously built an $\alpha$-level test $\Phi_\alpha$ defined by (2.5), with a probability
of second kind error at most equal to $\beta$ if Condition (3.2) is satisfied. Then, given a class $\mathcal{S}_\delta$ of
alternatives $h$, it is natural to measure the performance of the test via its uniform separation rate
$\rho(\Phi_\alpha, \mathcal{S}_\delta, \beta)$ over $\mathcal{S}_\delta$ (see [1]) defined by

$$\rho(\Phi_\alpha, \mathcal{S}_\delta, \beta) = \inf\left\{ \rho > 0 : \sup_{h \in \mathcal{S}_\delta,\ \|h\|_2 > \rho} \mathbb{P}_h(\Phi_\alpha = 0) \le \beta \right\}. \tag{3.3}$$

In order to compare our result with known asymptotic rates of testing, we consider the regime "T 
proportional to n" in this subsection. 

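We introduce for $\delta > 0$, $R > 0$ the Besov body

$$\mathcal{B}^{\delta}_{2,\infty}(R) = \left\{ f \in \mathbb{L}_2(\mathbb{R}) : f = \sum_{\lambda \in \Lambda} \beta_\lambda \varphi_\lambda,\ \forall j \ge 0,\ \sum_{k \in \mathcal{K}_j} \beta_{(j,k)}^2 \le R^2 2^{-2j\delta} \right\}.$$

We also consider a weaker version of the above Besov bodies defined for $p > 0$, $R' > 0$ by

$$\mathcal{W}^*_p(R') = \left\{ f \in \mathbb{L}_2(\mathbb{R}) : f = \sum_{\lambda \in \Lambda} \beta_\lambda \varphi_\lambda,\ \sup_{s > 0} s^p \sum_{\lambda \in \Gamma} \mathbf{1}_{|\beta_\lambda| > s} \le R'^p \right\}.$$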


Whereas the spaces $\mathcal{B}^{\delta}_{2,\infty}(R)$ constitute an ideal class to measure the regularity of the possible
alternatives $h$, the spaces $\mathcal{W}^*_p(R')$ constitute an ideal class to measure the sparsity of a wavelet decomposed
signal $h$. Indeed, if $f = \sum_{\lambda \in \Lambda} \beta_\lambda \varphi_\lambda \in \mathcal{W}^*_p(R')$, then the associated sequence $\beta = (\beta_\lambda)_{\lambda \in \Gamma}$
satisfies $\sup_{\ell \in \mathbb{N}^*} \ell^{1/p} |\beta|_{(\ell)} < \infty$, where the sequence $(|\beta|_{(\ell)})_\ell$ is the non-increasing rearrangement of $\beta$:
$|\beta|_{(1)} \ge |\beta|_{(2)} \ge \dots \ge |\beta|_{(\ell)} \ge \dots$. This condition gives a polynomial control of the decreasing rate of
the sequence $(|\beta|_{(\ell)})_\ell$: the smaller $p$, the sparser the signal. There exists an embedding between Besov and
weak Besov balls:

$$\mathcal{B}^{\delta}_{2,\infty}(R) \subset \mathcal{W}^*_{\frac{2}{1+2\delta}}(r),$$

where the radius $r$ of the weak Besov ball depends on $\delta$ and $R$ (more precisely, $r = 4^{\delta} R / \sqrt{2^{2\delta} - 1}$). See
[21, 32, 33] for more details and for extensions in a more general setting. So, we consider in this paper
such alternatives based on the intersection of Besov and weak Besov bodies, namely sparse functions
with some minimal regularity, see below.
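For a finite sequence of coefficients, the weak Besov quantity is easy to evaluate through the rearrangement characterisation recalled above, since $\sup_{s>0} s^p \sum_\lambda \mathbf{1}_{|\beta_\lambda|>s} = \max_\ell \ell\,|\beta|_{(\ell)}^p$. The sketch below (the function name `weak_radius` is hypothetical) returns the smallest admissible radius $R'$, that is $\max_\ell \ell^{1/p}|\beta|_{(\ell)}$.

```python
def weak_radius(coefficients, p):
    """Smallest R' such that sup_s s^p * #{|beta| > s} <= R'^p, for a finite sequence."""
    ordered = sorted((abs(b) for b in coefficients), reverse=True)
    # non-increasing rearrangement: ordered[l - 1] is |beta|_(l)
    return max((l ** (1.0 / p) * b for l, b in enumerate(ordered, start=1)), default=0.0)
```

A single spike `[1.0, 0.0, 0.0]` has radius 1 for every $p$, whereas one hundred coefficients equal to 0.1 have radius 10 for $p = 1$: for a comparable energy, spread-out sequences need a much larger weak Besov ball than sparse ones when $p$ is small.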

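To evaluate the uniform separation rates, we choose the following collection of weights $\{w_\lambda, \lambda \in \Gamma\}$
defined by

$$w_\lambda = 2\big(\ln(j+1) + \ln(\pi/\sqrt{6})\big) + \ln|\mathcal{K}_j|, \tag{3.4}$$

for any $\lambda = (j,k) \in \Gamma$, where $|\mathcal{K}_j|$ is the cardinal of $\mathcal{K}_j$, which is of order $2^j$. With this choice, the
collection of weights satisfies the condition $\sum_{\lambda \in \Gamma} e^{-w_\lambda} \le 1$. The following theorem gives the uniform
separation rates over $\mathcal{B}^{\delta}_{2,\infty}(R) \cap \mathcal{W}^*_{\frac{2}{1+2\gamma}}(R')$, for $\gamma \ge \delta$.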
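The normalisation of the weights (3.4) can be checked numerically. With $|\mathcal{K}_j| = 2^{j+1}$, each resolution level contributes $\frac{6}{\pi^2 (j+1)^2}$ to $\sum_{\lambda\in\Gamma} e^{-w_\lambda}$, so the partial sums increase to 1. The helper names below are hypothetical.

```python
import math


def weight(j):
    """w_lambda of (3.4) for lambda = (j, k), with |K_j| = 2^(j + 1)."""
    return 2.0 * (math.log(j + 1) + math.log(math.pi / math.sqrt(6.0))) + (j + 1) * math.log(2.0)


def partial_sum(j_max):
    """Sum of exp(-w_lambda) over all lambda = (j, k) with 0 <= j <= j_max."""
    return sum(2 ** (j + 1) * math.exp(-weight(j)) for j in range(j_max + 1))
```

For example, `partial_sum(0)` equals $6/\pi^2 \approx 0.608$ and `partial_sum(200)` is about 0.997, consistent with the bound $\sum_{\lambda \in \Gamma} e^{-w_\lambda} \le 1$.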

Theorem 3. Let $\alpha$, $\beta$ be fixed levels in $]0;1[$. Assume that $T$ is proportional to $n$. Let $\Phi_\alpha$ be the test
function defined by (2.5) with the weights $w_\lambda$'s defined by (3.4). Then, for any $\delta > 0$, $\gamma > 0$, $R > 0$,
$R' > 0$, if $2\delta > \gamma/(1 + 2\gamma)$,

$$\rho\Big(\Phi_\alpha,\ \mathcal{B}^{\delta}_{2,\infty}(R) \cap \mathcal{W}^*_{\frac{2}{1+2\gamma}}(R'),\ \beta\Big) \le C \left( \frac{\ln n}{n} \right)^{\frac{\gamma}{1+2\gamma}},$$

with $C$ a positive constant depending on $\delta$, $\gamma$, $R$, $R'$, $\alpha$, $\beta$, $\mu_c$, $R_1$ and $R_\infty$.

Note that this result holds for instance with $\delta = 1/4$. It corresponds to the minimal regularity
mentioned previously.

Theorem 3 illustrates the optimality of our testing procedure in the minimax setting. Indeed,
considering the regime "$T$ proportional to $n$", the uniform separation rates of the test $\Phi_\alpha$ match the
minimax separation rates obtained by Fromont et al. [11, 12], if $2\delta > \gamma/(1 + 2\gamma)$ and also $\delta < \gamma/2$ and
$\gamma > 1/2$ (see Theorem 1 of [11]). Furthermore, the upper bound on the uniform separation rates of our test
$\Phi_\alpha$ over $\mathcal{B}^{\delta}_{2,\infty}(R) \cap \mathcal{W}^*_{\frac{2}{1+2\gamma}}(R')$ has already been obtained, up to a logarithmic term, for a wavelet thresholding
estimation method proposed by Sansonnet [34] in a very similar context and, more precisely, it is
equal to the minimax estimation rates of the maxisets of the thresholding estimation procedure (see
[21, 29, 33] for more details). This means that it is at least as difficult to test as to estimate over such
classes of alternatives. Note that on Sobolev or Besov spaces, testing rates are usually faster than
estimation rates.

4 Simulation study 

In this section, we study our testing procedure from a practical point of view and we compare it to
the conditional Kolmogorov-Smirnov (KS) test and to a Gaussian Approximation of the Unitary Events
(GAUE) method developed by Tuleau-Malot et al. [35].

4.1 Description of the data 

We create different data sets that are, to a certain extent, a reflection of a neurobiological reality. We
consider the spike trains of two neurons $N_p$ and $N_c$ which are modeled by two point processes with
respective conditional intensities $\lambda_p$ and $\lambda_c$ defined by (1.1).

For real spike trains it is not reasonable to postulate the stationarity of $N_p$ and $N_c$, i.e. that $\mu_p$ and $\mu_c$
are constant and that the same function $h$ applies on the entire recording period $[0;T]$ (see Grün et
al. [16]). But this assumption is quite reasonable on smaller time ranges (see Grün [15] and Grammont
and Riehle [14]). However, to date, we have no algorithmic and statistical tool to clearly identify the
stationarity ranges. Several methods (UE and MTGAUE, see [35] for instance) propose to perform
many tests on different small windows of time and to use a multiple testing procedure (see Benjamini and
Hochberg [3] for instance) to combine them. Hence those methods can solve, at least in practice, this
stationarity problem. The aim of this simulation study is not to show how our testing procedure can
be incorporated in a Benjamini and Hochberg approach, which lies outside the scope of the present
paper, but to discuss the advantage of our method on one small window of time. This explains the use
of the simulated data described below.

We need therefore to simulate dependence between N p and iV c on [0; T] and to take into account 
the major part of the neurobiological reality. So, we simulate processes N p and N c whose intensities 
are respectively given by 

A p = 50 and A c = 20 + / h(t — u) dN p {u) , with h = OI^.q i]- 

J — OO 

In order to evaluate the performance of the different procedures, several values of the parameters $\theta$ and $\nu$ are tested. The parameter $\theta$ represents the strength of the influence of $N_p$ on $N_c$: the larger $\theta$, the stronger the influence of $N_p$ on $N_c$. The parameter $\nu$ introduces a possible minimal delay in the synchronization, i.e. the synchronization of the neuronal activity occurs with a delay $\delta$ uniform on $[\nu;0.01]$. We consider nine different data sets denoted $Data_0$, $Data_{10}$, $Data_{30}$, $Data_{50}$, $Data_{80}$, $Data_{10}^r$, $Data_{30}^r$, $Data_{50}^r$ and $Data_{80}^r$. For $k\in\{0,10,30,50,80\}$, $Data_k$ is simulated with $\theta=k$ and $\nu=0$, while for $k\in\{10,30,50,80\}$, $Data_k^r$ is simulated with $\theta=k$ and $\nu=0.005$.
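The simulation scheme above can be sketched as follows. This is only a minimal illustration under assumptions that are not stated in the text: the window length `T = 2.0` in the usage line is hypothetical, parent spikes occurring before time 0 are ignored, and the function names are ours. The sketch uses the fact that, given $N_p$, a Poisson process with intensity $20 + \sum_u h(\cdot-u)$ is the superposition of a homogeneous process of rate 20 and, for each parent spike $u$, a Poisson number (with mean $\theta(0.01-\nu)$) of children with delays uniform on $[\nu;0.01]$.

```python
import random

def homogeneous_poisson(rate, start, end, rng):
    """Sorted points of a homogeneous Poisson process of intensity `rate` on [start, end]."""
    points, t = [], start
    while True:
        t += rng.expovariate(rate)  # i.i.d. exponential inter-arrival times
        if t > end:
            return points
        points.append(t)

def poisson_count(mean, rng):
    """Poisson(mean) draw: number of unit-rate arrivals falling in [0, mean]."""
    if mean <= 0:
        return 0
    return len(homogeneous_poisson(1.0, 0.0, mean, rng))

def simulate_pair(theta, nu, T, rng, mu_p=50.0, mu_c=20.0, max_delay=0.01):
    """Return (N_p, N_c) on [0, T] for h = theta * 1_[nu, max_delay].

    Assumption of this sketch: parents before time 0 are ignored (edge effect).
    """
    n_p = homogeneous_poisson(mu_p, 0.0, T, rng)
    n_c = homogeneous_poisson(mu_c, 0.0, T, rng)       # spontaneous spikes, rate 20
    mean_children = theta * (max_delay - nu)           # integral of h
    for u in n_p:
        for _ in range(poisson_count(mean_children, rng)):
            child = u + rng.uniform(nu, max_delay)     # delay uniform on [nu, 0.01]
            if child <= T:
                n_c.append(child)
    return n_p, sorted(n_c)

# Hypothetical usage: one realization in the spirit of Data_30^r on a window of length 2.
n_p, n_c = simulate_pair(theta=30, nu=0.005, T=2.0, rng=random.Random(0))
```

Taking `theta=0` gives two independent homogeneous Poisson processes, i.e. a realization under the null hypothesis.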

4.2 The Kolmogorov-Smirnov test 

First, we look at the performance of the Kolmogorov-Smirnov test (see Darling [8]) to convince ourselves that this commonly used test is not reliable in this context. Indeed, even if the KS test is not a test of independence, the KS test may answer the problem. As said before, under $\mathcal{H}_0$ and conditionally on $U_1,\dots,U_n$ and $N_{c,tot}=m$, the observations of $N_c$ are i.i.d. with common law the uniform distribution on $[-1;T+1]$, so testing the goodness of fit of $N_c$ to this law is a natural way to detect a departure from $\mathcal{H}_0$. So, the use of the KS test is relevant.

Data set        Power for KS
$Data_{10}$     0.040
$Data_{30}$     0.051
$Data_{50}$     0.087
$Data_{80}$     0.113
$Data_{10}^r$   0.054
$Data_{30}^r$   0.059
$Data_{50}^r$   0.053
$Data_{80}^r$   0.073

Table 1: Power of the KS test with level $\alpha = 0.05$, evaluated for various interactions.

First, we focus on the empirical rate of type I error, which approximates the level of the test. We simulate 5000 independent realizations of $Data_0$, on which we perform the KS test with level $\alpha = 0.05$. The empirical rate of type I error evaluated on those data is 0.051, which is as desired.

What about the number of wrong rejections of $\mathcal{H}_1$? We consider the power of the test, which is the proportion of correct rejections of $\mathcal{H}_0$. We simulate 1000 independent realizations of $Data_{10}$, $Data_{30}$, $Data_{50}$, $Data_{80}$, $Data_{10}^r$, $Data_{30}^r$, $Data_{50}^r$ and $Data_{80}^r$, on which we perform the KS test with level $\alpha = 0.05$, and we evaluate the empirical power of the test. Table 1 summarizes the results. The power of the KS test is extremely close to the level $\alpha$. Hence the KS test is not able to clearly detect the dependence.
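For illustration, a KS test of uniformity on $[-1;T+1]$ and its empirical rejection rate can be sketched as below. This is not the implementation used for Table 1: the p-value relies on the asymptotic Kolmogorov series with a standard small-sample correction of the statistic, the function names are ours, and the sample size and window in the usage lines are hypothetical.

```python
import math
import random

def ks_uniform_statistic(points, low, high):
    """Two-sided KS distance between the empirical c.d.f. and Uniform[low, high]."""
    xs = sorted(points)
    m = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = (x - low) / (high - low)
        d = max(d, i / m - cdf, cdf - (i - 1) / m)
    return d

def kolmogorov_pvalue(d, m, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    if m == 0:
        return 1.0
    lam = (math.sqrt(m) + 0.12 + 0.11 / math.sqrt(m)) * d  # small-sample correction
    if lam < 0.2:
        return 1.0  # the p-value is numerically 1 in this range
    tail = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
               for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * tail))

def ks_rejects(points, low, high, alpha=0.05):
    """True when uniformity on [low, high] is rejected at level alpha."""
    d = ks_uniform_statistic(points, low, high)
    return kolmogorov_pvalue(d, len(points)) < alpha

def empirical_rejection_rate(sampler, low, high, repetitions, alpha=0.05):
    """Proportion of rejections over independent samples drawn by `sampler()`."""
    hits = sum(ks_rejects(sampler(), low, high, alpha) for _ in range(repetitions))
    return hits / repetitions

# Hypothetical usage under the null: m = 60 uniform points on [-1, T + 1] with T = 2.
rng = random.Random(0)
T = 2.0
level = empirical_rejection_rate(
    lambda: [rng.uniform(-1.0, T + 1.0) for _ in range(60)], -1.0, T + 1.0, 2000)
```

Replacing the sampler by the points of a simulated $N_c$ gives the empirical power instead of the empirical level.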



4.3 The GAUE method adapted to our context 

Before comparing both methods, we briefly recall the principle of the GAUE method. The aim of the GAUE method is to detect the dependence on a single window $[0;T]$. This method is based on the coincidences with delay. More precisely, for the couple of processes $(N_p,N_c)$, we compute the number of coincidences with delay $\delta$ on $[0;T]$, i.e. the variable $X_T = \int_{[0;T]^2}\mathbf{1}_{|x-y|\le\delta}\,dN_p(x)\,dN_c(y)$, that represents the number of pairs $(x,y)$ in $N_p\times N_c$ such that $|x-y|\le\delta$. Let us define $\hat\lambda_p = N_p([0;T])/T$ and $\hat\lambda_c = N_c([0;T])/T$, where $N_p([0;T])$ and $N_c([0;T])$ denote respectively the number of spikes of $N_p$ and $N_c$ in $[0;T]$. The quantities $\hat\lambda_p$ and $\hat\lambda_c$ are estimators of $\lambda_p$ and $\lambda_c$.

We reject the null hypothesis $\mathcal{H}_0$: "$h=0$" when $X_T \ge \hat m_0 + \hat\sigma u_{1-\alpha/2}$, where $\hat m_0 = \hat\lambda_p\hat\lambda_c(2T\delta-\delta^2)$, $\hat\sigma^2 = \hat\lambda_p\hat\lambda_c(2T\delta-\delta^2) + \hat\lambda_p\hat\lambda_c(\hat\lambda_p+\hat\lambda_c)\left(\frac{2}{3}\delta^3-\frac{1}{T}\delta^4\right)$ and $u_{1-\alpha/2}$ is the $(1-\alpha/2)$-quantile of a standard normal distribution. This threshold comes from the theory developed in [35] and is adapted to our context. The quantity $\hat m_0$ is a plug-in estimator of the expectation of $X_T$ under $\mathcal{H}_0$ and $\hat\sigma^2$ is an estimator of the variance. It can be shown that under the assumptions "$N_p$ and $N_c$ are Poisson processes" and "$N_p$ and $N_c$ are stationary", this test is asymptotically of level $\alpha$. Further details about the meaning of those different estimators are given in [35].
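The rejection rule just described can be sketched in a few lines. The code below is an illustrative sketch, not the authors' implementation: it counts the pairs of spikes at distance at most $\delta$, plugs the empirical rates into $(2T\delta-\delta^2)$ and $(\frac{2}{3}\delta^3-\delta^4/T)$, and compares the count to the Gaussian threshold. The function names and the numerical values in the usage lines are assumptions made for the example.

```python
import math
from statistics import NormalDist

def count_coincidences(n_p, n_c, delta):
    """Number of pairs (x, y) in N_p x N_c with |x - y| <= delta."""
    return sum(1 for x in n_p for y in n_c if abs(x - y) <= delta)

def gaue_test(n_p, n_c, T, delta, alpha=0.05):
    """Return (reject, coincidence count, threshold) for the window [0, T]."""
    if not n_p or not n_c:
        return False, 0, 0.0           # convention of this sketch: no spikes, no rejection
    rate_p = len(n_p) / T              # plug-in estimator of lambda_p
    rate_c = len(n_c) / T              # plug-in estimator of lambda_c
    m0 = rate_p * rate_c * (2.0 * T * delta - delta ** 2)
    var = m0 + rate_p * rate_c * (rate_p + rate_c) * (
        2.0 / 3.0 * delta ** 3 - delta ** 4 / T)
    threshold = m0 + math.sqrt(var) * NormalDist().inv_cdf(1.0 - alpha / 2.0)
    x = count_coincidences(n_p, n_c, delta)
    return x >= threshold, x, threshold

# Hypothetical usage: two perfectly synchronized trains (lag 0.001) on [0, 1].
parents = [0.02 * i for i in range(1, 50)]
children = [x + 0.001 for x in parents]
reject, x, threshold = gaue_test(parents, children, T=1.0, delta=0.005)
```

The double loop is quadratic in the number of spikes, which is acceptable on one small window; a sorted two-pointer sweep would be the natural replacement on long recordings.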

The GAUE method was developed jointly with a neurophysiologist and is in line with the UE method developed by Grün and coauthors (for instance, see [15] and [16]), which is a commonly used method in neurosciences. One of its main disadvantages is that $\delta$ has to be chosen beforehand. Part of the aim of this work is to propose a more adaptive method.
Data set     our procedure    GAUE
$Data_0$     0.0486           0.0446/0.0510/0.0548

Table 2: Empirical rate of type I error associated with our procedure and the GAUE method. The theoretical level is $\alpha = 0.05$. Since the GAUE method depends on the tuning parameter $\delta$, the given value is the minimum/median/maximum of the empirical rate over all the values of $\delta$.



4.4 Our procedure in practice 

From a theoretical point of view, considering all the resolution levels is not a problem. However, in practice, we have a maximal resolution level, denoted $j_0$ in the sequel. We choose $j_0$ quite small for computational time reasons ($j_0 = 3$). Thus, it is better to have the support $[-A;A]$ of $h$ with $A$ close to 1. Nevertheless, if in addition to a global detection, we are interested in a more local detection, i.e. in the coefficients $\lambda\in\Gamma$ for which the corresponding single test rejects, $j_0$ should not be too small. For instance, if $h = \mathbf{1}_{[0;A]}$ and if the order of magnitude of $A$ is $2^{-J}$ or $1-2^{-J}$, with $J > j_0+1$, our procedure cannot locally detect the jump of $h$ at $A$. Consequently, taking $A$ close to $1/2$ may appear reasonable. Hence, the data are multiplied by 50 before being treated.

Let us recall that our test rejects $\mathcal{H}_0$ when there exists at least one $\lambda = (j,k)$ in $\Gamma$ with $j\le j_0$ such that

$$\hat T_\lambda > q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}\left(u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}e^{-w_\lambda}\right),$$

where $j_0\ge 1$ denotes the maximal resolution level, $u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}$ is defined by (2.4) and the $w_\lambda$'s are given by (3.4). Hence, for each observation of the process $N_c$ whose number of points is denoted by $N_{c,tot} = m$, given the points of $N_p$ denoted $U_1,\dots,U_n$, we estimate $u_\alpha^{[U_1,\dots,U_n;m]}$ and the quantiles $q_\lambda^{[U_1,\dots,U_n;m]}$ by classical Monte Carlo methods based on the simulation of $B$ independent sequences $\{V^b, 1\le b\le B\}$, where $V^b = (V_1^b,\dots,V_m^b)$ is an $m$-sample of uniform variables on $[-1;T+1]$ (i.e. the law of $N_c$ under $\mathcal{H}_0$, conditionally on $U_1,\dots,U_n$ and $N_{c,tot} = m$). We fix $B = 20000$ in the sequel since for larger values of $B$, the gain in precision for the estimates of $u_\alpha^{[U_1,\dots,U_n;m]}$ and $q_\lambda^{[U_1,\dots,U_n;m]}$ becomes negligible. We define for any $\lambda = (j,k)$ in $\Gamma$ with $j\le j_0$, for $1\le b\le B$, the statistic $\hat T_\lambda^{0,b}$ as in (2.2), computed from the sample $V^b$.

We compute these $\hat T_\lambda^{0,b}$'s with a cascade algorithm (see Mallat [25]).

Half of the $m$-samples is used to estimate the quantiles by putting in ascending order the $\hat T_\lambda^{0,b}$'s for any $\lambda$. The other half is used to approximate the conditional probabilities occurring in (2.4). Then, $u_\alpha^{[U_1,\dots,U_n;m]}$ is obtained by dichotomy, such that the estimated conditional probability occurring in (2.4) is less than $\alpha$, but as close as possible to $\alpha$. Choosing $j_0 = 3$, our procedure considers 15 single tests $\Phi_{\lambda,\alpha}$ involving wavelets whose support length is respectively 0.125, 0.25, 0.5 and 1. This allows us to make detections at the positions $m\times 2^{-3}$ ($m$ in $\{0,\dots,7\}$) with a range of $2^{-3}$. Due to the scaling of the data in our procedure, we need to divide the positions and the range of the possible detections by 50. Consequently, in real time, the positions and the range become $m\times 0.0025$ ($m$ in $\{0,\dots,7\}$) and 0.0025.
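The calibration step (half of the simulated samples for the quantiles, the other half for the conditional probability, then a dichotomy on $u$) can be sketched as follows. This is a hedged illustration only: it assumes that (2.4) defines $u_\alpha$ as the largest $u$ for which the conditional probability that at least one single statistic exceeds its $(1-ue^{-w_\lambda})$-quantile is at most $\alpha$, that the weights are nonnegative so that $u$ can be searched in $[0;1]$, and that the simulated statistics are already available as a table with one row per sample $V^b$. The names `calibrate_u` and `upper_quantile`, and the toy inputs, are ours.

```python
import math
import random

def upper_quantile(sorted_values, level):
    """Smallest order statistic q such that at most a fraction `level` of the values exceed q."""
    n = len(sorted_values)
    k = min(n, max(1, math.ceil((1.0 - level) * n)))
    return sorted_values[k - 1]

def calibrate_u(stats, weights, alpha, iterations=30):
    """stats[b][l] is the l-th single statistic computed on the b-th simulated sample."""
    half = len(stats) // 2
    first, second = stats[:half], stats[half:]
    # First half: sorted simulated statistics, one column per single test.
    columns = [sorted(row[l] for row in first) for l in range(len(weights))]

    def quantiles(u):
        return [upper_quantile(col, min(1.0, u * math.exp(-w)))
                for col, w in zip(columns, weights)]

    def rejection_rate(q):
        # Second half: estimated probability that at least one single test rejects.
        hits = sum(1 for row in second if any(t > ql for t, ql in zip(row, q)))
        return hits / len(second)

    low, high = 0.0, 1.0
    for _ in range(iterations):        # dichotomy: the rejection rate is non-decreasing in u
        mid = 0.5 * (low + high)
        if rejection_rate(quantiles(mid)) <= alpha:
            low = mid
        else:
            high = mid
    q = quantiles(low)
    return low, q, rejection_rate(q)

# Toy usage with hypothetical statistics (absolute Gaussian draws) and equal weights.
rng = random.Random(0)
toy_stats = [[abs(rng.gauss(0.0, 1.0)) for _ in range(3)] for _ in range(400)]
u_hat, q_hat, rate = calibrate_u(toy_stats, [math.log(3.0)] * 3, alpha=0.1)
```

Splitting the simulated samples in two keeps the quantile estimates independent of the samples used to evaluate the probability, which is the reason given above for using two halves.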

4.5 Results 

We compare our testing procedure and the GAUE method on the different data sets. As for the KS test, we first look at the level of both tests. We simulate 5000 independent realizations of $Data_0$, on which we perform the present method and the GAUE method with $\alpha = 0.05$. For the GAUE method, the tuning parameter $\delta$ varies on a regular grid of $[0.001;0.04]$ with step 0.001. The order of magnitude of $\delta$ is similar to the range of the possible detections made with our method. On those data, we evaluate the empirical rate of type I error. Those results, for both methods, are summarized in Table 2: both testing methods seem to have a correct level in practice. This means that the number of wrong rejections of $\mathcal{H}_0$ is well controlled.

Data set        our procedure    GAUE
$Data_{10}$     0.095            0.068/0.1085/0.168
$Data_{30}$     0.478            0.154/0.3795/0.707
$Data_{50}$     0.864            0.278/0.6645/0.953
$Data_{80}$     0.993            0.451/0.9160/0.998
$Data_{10}^r$   0.073            0.047/0.0575/0.077
$Data_{30}^r$   0.282            0.050/0.1415/0.277
$Data_{50}^r$   0.664            0.053/0.2825/0.589
$Data_{80}^r$   0.968            0.048/0.4900/0.879

Table 3: Powers associated with our procedure and the GAUE method, evaluated for various interactions. The theoretical level is $\alpha = 0.05$. Since the GAUE method depends on the tuning parameter $\delta$, the given value is the minimum/median/maximum of the empirical power over all the values of $\delta$.

Now we want to see if the number of wrong rejections of $\mathcal{H}_1$ is also controlled. To evaluate the power of both tests, we simulate 1000 independent realizations of $Data_{10}$, $Data_{30}$, $Data_{50}$, $Data_{80}$, $Data_{10}^r$, $Data_{30}^r$, $Data_{50}^r$ and $Data_{80}^r$, as for the KS test (see Section 4.2). The empirical powers are given in Table 3: both methods are comparable in terms of power when $\nu = 0$ (i.e. for the $Data_k$). However, when $\nu\ne 0$, our method seems to perform better since its power is higher.

Moreover, even if both methods are comparable in terms of performance, the testing procedure proposed in this paper has an advantage over the GAUE method: our method is statistically adaptive. Indeed, the parameter $\delta$ which appears in the GAUE method is not calibrated in practice. In our method, we aggregate the single tests over $(j,k)$. So on the one hand, we do not need to specify this parameter but just an upper bound $j_0$, the maximal resolution level: the method, through the weights (3.4), adapts to this unspecified parameter $(j,k)$. On the other hand, by looking at the single tests $\Phi_{\lambda,\alpha}$ that have supported the rejection, we are able to partially recover an important piece of information for the practitioner: the position ($k2^{-j}$) and the range ($2^{-j}$) of the influence. In fact, by looking only at this single testing procedure, we get an upper value for 0.01 and a lower value for $\nu$ on the range of the delay $\delta$ of synchronization. To obtain more precise estimations of the support of $h$, we can consider an estimate of $h$, for instance the one proposed by Sansonnet [34]. The capacity of our method to get information on $\nu$ is due to the fact that for a resolution level $j$ we consider different positions $k$. This is not possible with the GAUE method. This explains why the results on $Data_{10}^r$, $Data_{30}^r$, $Data_{50}^r$ and $Data_{80}^r$ are better with our method.



5 Conclusion 

In this paper, we have investigated the influence of a point process on another one. We have built a multiple testing procedure based on wavelet thresholding. The main results of the paper have revealed the optimality of the procedure. Furthermore, our test is adaptive in the minimax sense over classes of alternatives essentially based on weak Besov bodies. Then, from a practical point of view, our method answers several practical questions. However, a number of challenges remain before applying our method to real data. To overcome the problem of stationarity, we could use a Benjamini and Hochberg approach as for the GAUE method. Finally, we could consider a more sophisticated model that takes into account the phenomena of spontaneous apparition and self-excitation (as for the complete Hawkes model). But this model raises serious difficulties from the theoretical point of view. This is an exciting challenge.
6 Proofs



Throughout the proofs, we introduce some positive constants denoted by $C(\xi,\dots)$, meaning that they may depend on $\xi,\dots$. They do not depend on $j$, $n$ and $T$ (which drive the asymptotics). Furthermore, the values of these constants may vary from line to line.

We recall that $\{\varphi_\lambda, \lambda\in\Lambda\}$ is the Haar basis and consequently, we have:

$$\|\varphi_\lambda\|_1 = 2^{-j/2},\quad \|\varphi_\lambda\|_2 = 1 \quad\text{and}\quad \|\varphi_\lambda\|_\infty = 2^{j/2}.$$

In the case of a biorthogonal wavelet basis, $\|\varphi_\lambda\|_1$, $\|\varphi_\lambda\|_2$ and $\|\varphi_\lambda\|_\infty$ are of the same order as above, up to a positive constant respectively depending on $\|\psi\|_1$, $\|\psi\|_2$ and $\|\psi\|_\infty$, where $\psi$ is the mother wavelet associated with the considered biorthogonal wavelet basis. Consequently, the same proofs potentially lead to the results on a biorthogonal wavelet basis, as in [34] for the wavelet thresholding estimation.
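The three norms of the Haar functions recalled above, as well as the fact that each wavelet integrates to zero, can be checked numerically. The following is a small sketch under the assumption that $\varphi_{(j,k)}(x) = 2^{j/2}\psi(2^jx-k)$ with $\psi$ equal to $-1$ on the left half of $[0;1]$ and $+1$ on the right half; the sign convention does not affect the norms. The function names are ours.

```python
import math

def haar(j, k, x):
    """Haar wavelet phi_(j,k) at x: -2^(j/2) on the left half of its support, +2^(j/2) on the right half."""
    y = 2 ** j * x - k
    if 0.0 <= y <= 0.5:
        return -(2.0 ** (j / 2))
    if 0.5 < y <= 1.0:
        return 2.0 ** (j / 2)
    return 0.0

def norms(j, k, cells=4096):
    """Midpoint-rule L1 norm, L2 norm, sup norm and integral of phi_(j,k) over [0, 1]."""
    step = 1.0 / cells
    values = [haar(j, k, (i + 0.5) * step) for i in range(cells)]
    l1 = sum(abs(v) for v in values) * step
    l2 = math.sqrt(sum(v * v for v in values) * step)
    sup = max(abs(v) for v in values)
    integral = sum(values) * step
    return l1, l2, sup, integral

# Check the identities for all resolution levels j <= 3 used in the simulation study.
for j in range(4):
    for k in range(2 ** j):
        l1, l2, sup, integral = norms(j, k)
        assert math.isclose(l1, 2.0 ** (-j / 2), rel_tol=1e-9)
        assert math.isclose(l2, 1.0, rel_tol=1e-9)
        assert math.isclose(sup, 2.0 ** (j / 2), rel_tol=1e-9)
        assert abs(integral) < 1e-9
```

The midpoint rule is exact here because the wavelets are piecewise constant with breakpoints on the dyadic grid.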



6.1 Proof of Proposition 1 



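We first notice that for any $\lambda$ in $\Gamma$, for any $u\in[0;T]$,

$$\int_{-1}^{T+1}\varphi_\lambda(t-u)\,dt = 0. \qquad (6.1)$$

Let $\lambda\in\Gamma$ be fixed. By considering the aggregated process (1.2), we can write

$$G(\varphi_\lambda) = G^0(\varphi_\lambda) + \tilde G(\varphi_\lambda), \qquad (6.2)$$

with

$$G^0(\varphi_\lambda) = \int_{\mathbb{R}}\sum_{i=1}^n\left[\varphi_\lambda(x-U_i)-\frac{n-1}{n}\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dN_c^0(x)$$

and

$$\tilde G(\varphi_\lambda) = \int_{\mathbb{R}}\sum_{i=1}^n\left[\varphi_\lambda(x-U_i)-\frac{n-1}{n}\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]\sum_{j=1}^n dN_c^j(x).$$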



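On the one hand, we notice that $\tilde G(\varphi_\lambda)$ is the same quantity as the one defined by equation (2.3) of [34]. Thus, by applying the first part of Proposition 1 of [34], we obtain

$$\mathbb{E}(\tilde G(\varphi_\lambda)) = n\int_{\mathbb{R}}\varphi_\lambda(x)h(x)\,dx.$$

On the other hand, we have

$$G^0(\varphi_\lambda) = \int_{\mathbb{R}}\varphi_\lambda(x-U_1)\,dN_c^0(x) + \sum_{i=2}^n\int_{\mathbb{R}}\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dN_c^0(x).$$

Thus,

$$\mathbb{E}(G^0(\varphi_\lambda)\,|\,U_1,\dots,U_n) = \int_{-1}^{T+1}\varphi_\lambda(x-U_1)\mu_c\,dx + \sum_{i=2}^n\int_{-1}^{T+1}\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]\mu_c\,dx$$

and by using (6.1), we obtain

$$\mathbb{E}(G^0(\varphi_\lambda)) = \sum_{i=2}^n\int_{-1}^{T+1}\mathbb{E}\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]\mu_c\,dx = 0.$$

Finally,

$$\mathbb{E}(\hat\beta_\lambda) = \mathbb{E}\left(\frac{G(\varphi_\lambda)}{n}\right) = \int_{\mathbb{R}}\varphi_\lambda(x)h(x)\,dx = \beta_\lambda,$$

which proves Proposition 1.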
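6.2 Proof of Proposition 2

Let $\alpha$ be a fixed level in $]0;1[$. Let $\lambda\in\Gamma$ be fixed. First, the probability that the single test defined by (2.3) wrongly detects a signal is

$$\mathbb{P}_0(\Phi_{\lambda,\alpha} = 1) = \mathbb{P}_0\left(\hat T_\lambda > q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha)\right).$$

Since conditionally on $U_1,\dots,U_n$ and $N_{c,tot}$, $\hat T_\lambda$ and $\hat T^0_{\lambda,N_{c,tot}}$ have exactly the same distribution under $\mathcal{H}_0$, $q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha)$ is also the $(1-\alpha)$-quantile of $\hat T_\lambda\,|\,U_1,\dots,U_n;N_{c,tot}$ under $\mathcal{H}_0$. Thus

$$\mathbb{P}_0(\Phi_{\lambda,\alpha} = 1)\le\alpha$$

and the level of the single test is $\alpha$.

Then, the probability that the multiple test defined by (2.5) wrongly detects a signal is

$$\mathbb{P}_0(\Phi_\alpha = 1) = \mathbb{P}_0\left(\max_{\lambda\in\Gamma}\left(\hat T_\lambda - q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}\left(u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}e^{-w_\lambda}\right)\right) > 0\right).$$

By definition (2.4) of $u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}$,

$$\mathbb{P}_0\left(\max_{\lambda\in\Gamma}\left(\hat T_\lambda - q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}\left(u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}e^{-w_\lambda}\right)\right) > 0\,\Big|\,U_1,\dots,U_n;N_{c,tot}\right)\le\alpha,$$

because conditionally on $U_1,\dots,U_n$ and $N_{c,tot}$, $\hat T_\lambda$ and $\hat T^0_{\lambda,N_{c,tot}}$ have exactly the same distribution under $\mathcal{H}_0$. By taking the expectation over $U_1,\dots,U_n$ and $N_{c,tot}$, we obtain that

$$\mathbb{P}_0(\Phi_\alpha = 1)\le\alpha$$

and the level of the multiple test is $\alpha$.

Furthermore, by Bonferroni's inequality we have

$$\mathbb{P}\left(\max_{\lambda\in\Gamma}\left(\hat T^0_{\lambda,N_{c,tot}} - q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha e^{-w_\lambda})\right) > 0\,\Big|\,U_1,\dots,U_n;N_{c,tot}\right) \le \sum_{\lambda\in\Gamma}\mathbb{P}\left(\hat T^0_{\lambda,N_{c,tot}} > q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha e^{-w_\lambda})\,\Big|\,U_1,\dots,U_n;N_{c,tot}\right) \le \sum_{\lambda\in\Gamma}\alpha e^{-w_\lambda}\le\alpha,$$

and consequently $u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}\ge\alpha$ by definition (2.4) of $u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}$, which concludes the proof of Proposition 2.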

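6.3 Proof of Theorem 1

Let $\lambda\in\Gamma$ be fixed. Here we want to find a condition which will guarantee that

$$\mathbb{P}_h(\Phi_{\lambda,\alpha} = 0)\le\beta,$$

given $\beta\in]0;1[$.

Let us introduce $q^\alpha_{1-\beta/2}$, the $(1-\beta/2)$-quantile of the conditional quantile $q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha)$. Then for any $h$,

$$\mathbb{P}_h(\Phi_{\lambda,\alpha} = 0) = \mathbb{P}_h\left(\hat T_\lambda\le q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha),\ q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha)\le q^\alpha_{1-\beta/2}\right) + \mathbb{P}_h\left(\hat T_\lambda\le q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha),\ q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha) > q^\alpha_{1-\beta/2}\right) \le \mathbb{P}_h\left(\hat T_\lambda\le q^\alpha_{1-\beta/2}\right) + \beta/2,$$

and a condition which guarantees $\mathbb{P}_h(\hat T_\lambda\le q^\alpha_{1-\beta/2})\le\beta/2$ will be enough to ensure that

$$\mathbb{P}_h(\Phi_{\lambda,\alpha} = 0)\le\beta.$$

The following lemma gives such a condition.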

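Lemma 6.1. Let $\alpha,\beta$ be fixed levels in $]0;1[$. For any $\lambda = (j,k)\in\Gamma$, if

$$\mathbb{E}_h(\hat T_\lambda) > \sqrt{\frac{2\zeta Q_{j,n,T}}{\beta}} + q^\alpha_{1-\beta/2} \qquad (6.3)$$

for a particular $\zeta$ which is a positive constant depending on $\mu_c$, $R_1$ and $R_\infty$, where

$$Q_{j,n,T} = \frac{1}{n}+\frac{1}{T}+\frac{2^{-j}n}{T^2},$$

then

$$\mathbb{P}_h(\hat T_\lambda\le q^\alpha_{1-\beta/2})\le\beta/2,$$

so that

$$\mathbb{P}_h(\Phi_{\lambda,\alpha} = 0)\le\beta.$$

The proof of this lemma is postponed to Section 6.6.1.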

In order to have an idea of the order of the right-hand side of (6.3), we are now interested in the control of $q^\alpha_{1-\beta/2}$, the $(1-\beta/2)$-quantile of $q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha)$. A sharp upper bound for $q^\alpha_{1-\beta/2}$ is given by the following lemma.

Lemma 6.2. Let $\alpha,\beta$ be fixed levels in $]0;1[$. For any $\lambda = (j,k)\in\Gamma$, there exists some positive constant $\kappa$ depending on $\beta$, $\mu_c$ and $R_1$ such that

$$q^\alpha_{1-\beta/2}\le\kappa\left\{\sqrt{\ln(2/\alpha)}\left(\frac{1}{\sqrt n}+\frac{1}{\sqrt T}+\frac{2^{-j/2}\sqrt n}{T}\right)+\ln(2/\alpha)\left(\frac{2^{j/2}}{n^{3/2}}+\frac{2^{-j/2}}{nT}\right)\right\}.$$

The proof of this lemma is postponed to Section 6.6.2.

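Now, observe that if Condition (3.1) of Theorem 1 is satisfied, namely

$$|\beta_\lambda| > \sqrt{\frac{2\zeta Q_{j,n,T}}{\beta}}+\kappa\left\{\sqrt{\ln(2/\alpha)}\left(\frac{1}{\sqrt n}+\frac{1}{\sqrt T}+\frac{2^{-j/2}\sqrt n}{T}\right)+\ln(2/\alpha)\left(\frac{2^{j/2}}{n^{3/2}}+\frac{2^{-j/2}}{nT}\right)\right\},$$

then by Lemma 6.2,

$$|\beta_\lambda| > \sqrt{\frac{2\zeta Q_{j,n,T}}{\beta}}+q^\alpha_{1-\beta/2}.$$

We notice by Jensen's inequality that $|\beta_\lambda| = |\mathbb{E}_h(\hat\beta_\lambda)|\le\mathbb{E}_h(|\hat\beta_\lambda|) = \mathbb{E}_h(\hat T_\lambda)$. Thus, Condition (6.3) of Lemma 6.1 is satisfied and by Lemma 6.1,

$$\mathbb{P}_h(\Phi_{\lambda,\alpha} = 0)\le\beta,$$

which concludes the proof of Theorem 1.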
6.4 Proof of Theorem 2

Since $u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}\ge\alpha$ (see Proposition 2) and by setting $\alpha_\lambda = \alpha e^{-w_\lambda}$, we have

$$\mathbb{P}_h(\Phi_\alpha = 0) = \mathbb{P}_h\left(\forall\lambda\in\Gamma,\ \hat T_\lambda\le q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}\left(u_\alpha^{[U_1,\dots,U_n;N_{c,tot}]}e^{-w_\lambda}\right)\right)\le\inf_{\lambda\in\Gamma}\mathbb{P}_h(\Phi_{\lambda,\alpha_\lambda} = 0)\le\beta,$$

as soon as there exists $\lambda$ in $\Gamma$ such that $\mathbb{P}_h(\Phi_{\lambda,\alpha_\lambda} = 0)\le\beta$.

First, let us give the precise values of the constants that appear in Condition (3.2) of Theorem 2:

$$C_1 = 8\left(\frac{\zeta}{\beta}+3\kappa^2\ln(2/\alpha)\right),\quad C_2 = 24\kappa^2,\quad C_3 = 8\kappa^2\ln^2(2/\alpha),\quad C_4 = 16\kappa^2\ln(2/\alpha)\quad\text{and}\quad C_5 = 8\kappa^2,$$

where $\zeta$ and $\kappa$ are the constants defined respectively by Lemma 6.1 and Lemma 6.2. We recall that $Q_{j,n,T} = \frac{1}{n}+\frac{1}{T}+\frac{2^{-j}n}{T^2}$ and we denote $R_{j,n,T} = \frac{2^j}{n^3}+\frac{2^{-j}}{n^2T^2}$.
satisfied. Thus, 



fall! > 8 ( ^ + 3K 2 ln(2/a) W+3k 2 J] 



xeL 



1 n 
n T 2 



+ 8k 2 In 2 (2/a)D L + 16k 2 In (2/a) ^ w A + 8k 2 ^ 
Since In (2/a) + w\ = In (2/a A ), 

E^A>E{ 8 (| +3^ 2 ln(2/a A ) 
AeL agl ^ ^ 

and it implies that there exists one coefficient A = (j, k) in V such that 



-1 



2^ 1 

— + 

n A n 



2 T 2 



1 n 
n T 2 



+ 8K 2 ln 2 (2/a A ) 



2^ 1 



n J n 



2^2 



>s(| + 3K 2 ln(2/a A ) 



1 n 
n T 2 



2J 4- i 



+ 8k 2 In 2 (2/a A ) 
we have: 



2^ 1 



r n 



2j'2 



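Seeing that $Q_{j,n,T}\le 2\left[\frac{1}{n}+\frac{n}{T^2}\right]$ and $R_{j,n,T}\le\frac{2^j}{n^3}+\frac{1}{n^2T^2}$, we have:

$$\beta_\lambda^2 > \frac{4\zeta}{\beta}Q_{j,n,T}+12\kappa^2\ln(2/\alpha_\lambda)Q_{j,n,T}+8\kappa^2\ln^2(2/\alpha_\lambda)R_{j,n,T}.$$

Since $(\sqrt a+\sqrt b)^2\le 2(a+b)$ and $(\sqrt a+\sqrt b+\sqrt c)^2\le 3(a+b+c)$ for all nonnegative real numbers $a$, $b$, $c$,

$$\beta_\lambda^2 > \frac{4\zeta}{\beta}Q_{j,n,T}+4\kappa^2\ln(2/\alpha_\lambda)\left(\frac{1}{\sqrt n}+\frac{1}{\sqrt T}+\frac{2^{-j/2}\sqrt n}{T}\right)^2+4\kappa^2\ln^2(2/\alpha_\lambda)\left(\frac{2^{j/2}}{n^{3/2}}+\frac{2^{-j/2}}{nT}\right)^2$$

and then,

$$\sqrt{\beta_\lambda^2} > \sqrt{\frac{2\zeta Q_{j,n,T}}{\beta}}+\kappa\left\{\sqrt{\ln(2/\alpha_\lambda)}\left(\frac{1}{\sqrt n}+\frac{1}{\sqrt T}+\frac{2^{-j/2}\sqrt n}{T}\right)+\ln(2/\alpha_\lambda)\left(\frac{2^{j/2}}{n^{3/2}}+\frac{2^{-j/2}}{nT}\right)\right\}.$$

Finally, it is equivalent to

$$|\beta_\lambda| > \sqrt{\frac{2\zeta Q_{j,n,T}}{\beta}}+\kappa\left\{\sqrt{\ln(2/\alpha_\lambda)}\left(\frac{1}{\sqrt n}+\frac{1}{\sqrt T}+\frac{2^{-j/2}\sqrt n}{T}\right)+\ln(2/\alpha_\lambda)\left(\frac{2^{j/2}}{n^{3/2}}+\frac{2^{-j/2}}{nT}\right)\right\},$$

which is exactly Condition (3.1) of Theorem 1 with level $\alpha_\lambda$, and we conclude the proof of Theorem 2 by applying Theorem 1.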
6.5 Proof of Theorem 3

With $T$ proportional to $n$, Condition (3.2) of Theorem 2 is satisfied if there exists one finite subset $L$ of $\Gamma$ such that



\l ^ \\h - h L \\\ + C(a,f3,/j, c ,Ri,R a 



D * + E w >] \ + ( °l + E + E w i ) K \ , 

v AGL / \ AeL AeL / ) 



with $j_L = \max\{j\ge 0 : (j,k)\in L \text{ with } k\in\mathcal{K}_j\}$, $\sum_{\lambda\in L}w_\lambda\le C(j_L+1)D_L$ and $\sum_{\lambda\in L}w_\lambda^2\le C(j_L+1)^2D_L$. Consequently, Condition (3.2) is satisfied if there exists one finite subset $L$ of $\Gamma$ such that

$$\|h\|_2^2\ge\|h-h_L\|_2^2+C(\alpha,\beta,\mu_c,R_1,R_\infty)\frac{(j_L+1)D_L}{n}, \qquad (6.4)$$

with the maximal resolution level $j_L$ such that $2^{j_L}\le n^2/\ln n$.

Let $J\ge 1$ that will be chosen later. We consider the following finite subset $\Gamma_J$ of $\Gamma$

$$\Gamma_J = \{\lambda = (j,k)\in\Gamma : 0\le j\le J,\ k\in\mathcal{K}_j\}.$$

We introduce for all integer $D\le|\Gamma_J|$ the subset $L$ of $\Gamma_J$ such that $\{\beta_\lambda, \lambda\in L\}$ is the set of the $D$ largest coefficients among $\{\beta_\lambda, \lambda\in\Gamma_J\}$. We can notice that

$$\|h-h_L\|_2^2 = \|h-h_{\Gamma_J}\|_2^2+\|h_{\Gamma_J}-h_L\|_2^2.$$
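On the one hand, since $h$ belongs to $\mathcal{B}^\delta_{2,\infty}(R)$,

$$\|h-h_{\Gamma_J}\|_2^2 = \sum_{j>J}\sum_{k\in\mathcal{K}_j}\beta_{(j,k)}^2\le C(\delta)R^2 2^{-2J\delta}.$$

On the other hand, using equivalent definitions of weak Besov balls given by Lemma 2.2 of [21] and using for instance page 211 of [11], we obtain:

$$\|h_{\Gamma_J}-h_L\|_2^2\le C(\gamma)R''^2D^{-2\gamma},$$

since $h$ belongs to $W^*_{\frac{2}{1+2\gamma}}(R')$, with $R''$ an absolute positive constant possibly depending on $\gamma$ and $R'$.

Taking

$$J = \lfloor\log_2(n^\varepsilon)\rfloor+1$$

for some $0<\varepsilon<2$, we obtain that the right-hand side of (6.4) is upper bounded by

$$C(\delta,\gamma,R,R',\alpha,\beta,\mu_c,R_1,R_\infty)\left(n^{-2\varepsilon\delta}+D^{-2\gamma}+\frac{D\ln n}{n}\right).$$

Taking $D = \lfloor(n/\ln n)^{1/(1+2\gamma)}\rfloor$ and $\varepsilon > \gamma/(\delta(1+2\gamma))$, we obtain that the right-hand side of (6.4) is upper bounded by

$$C(\delta,\gamma,R,R',\alpha,\beta,\mu_c,R_1,R_\infty)\left(\frac{n}{\ln n}\right)^{-\frac{2\gamma}{1+2\gamma}},$$

when $2\delta > \gamma/(1+2\gamma)$ and so,

$$\rho\left(\Phi_\alpha,\mathcal{B}^\delta_{2,\infty}(R)\cap W^*_{\frac{2}{1+2\gamma}}(R'),\beta\right)\le C(\delta,\gamma,R,R',\alpha,\beta,\mu_c,R_1,R_\infty)\left(\frac{n}{\ln n}\right)^{-\frac{\gamma}{1+2\gamma}},$$

which concludes the proof of Theorem 3.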
6.6 Proof of lemmas
6.6.1 Proof of Lemma 6.1

Let $\lambda\in\Gamma$ be fixed. From Markov's inequality, we have that for any $x > 0$,

$$\mathbb{P}_h\left(|\hat T_\lambda-\mathbb{E}_h(\hat T_\lambda)| > x\right)\le\frac{\mathrm{Var}(\hat T_\lambda)}{x^2}. \qquad (6.5)$$



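Let us control $\mathrm{Var}(\hat T_\lambda) = \mathbb{E}_h(\hat T_\lambda^2)-\mathbb{E}_h(\hat T_\lambda)^2$. We easily obtain by Jensen's inequality and by considering the decomposition (6.2) of $G(\varphi_\lambda)$:

$$\mathrm{Var}(\hat T_\lambda)\le\mathrm{Var}(\hat\beta_\lambda)\le\frac{1}{n^2}\mathrm{Var}\left(G^0(\varphi_\lambda)+\tilde G(\varphi_\lambda)\right)\le\frac{2}{n^2}\left[\mathrm{Var}(G^0(\varphi_\lambda))+\mathrm{Var}(\tilde G(\varphi_\lambda))\right],$$

with

$$\mathrm{Var}(\tilde G(\varphi_\lambda))\le C(R_1,R_\infty)\left\{n+\frac{n^2}{T}+\frac{2^{-j}n^3}{T^2}\right\}$$

by applying the second part of Proposition 1 of [34]. It remains to compute $\mathrm{Var}(G^0(\varphi_\lambda))$. For this purpose, we apply the same methodology developed in Section 6.1.2 of [34]. We have the following decomposition of $\mathrm{Var}(G^0(\varphi_\lambda))$ into two terms:

$$\mathrm{Var}(G^0(\varphi_\lambda)) = \mathbb{E}\left(\mathrm{Var}(G^0(\varphi_\lambda)\,|\,U_1,\dots,U_n)\right)+\mathrm{Var}\left(\mathbb{E}(G^0(\varphi_\lambda)\,|\,U_1,\dots,U_n)\right). \qquad (6.6)$$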

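We start by dealing with the first term of (6.6). We have

$$\mathrm{Var}(G^0(\varphi_\lambda)\,|\,U_1,\dots,U_n) = \int_{-1}^{T+1}\left(\sum_{i=1}^n\left[\varphi_\lambda(x-U_i)-\frac{n-1}{n}\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]\right)^2\mu_c\,dx$$
$$= \mu_c\int_{-1}^{T+1}\left(\varphi_\lambda(x-U_1)+\sum_{i=2}^n\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]\right)^2dx$$
$$= \mu_c\int_{-1}^{T+1}\varphi_\lambda^2(x-U_1)\,dx+2\mu_c\int_{-1}^{T+1}\varphi_\lambda(x-U_1)\sum_{i=2}^n\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dx$$
$$+\mu_c\int_{-1}^{T+1}\sum_{i=2}^n\sum_{k=2}^n\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]\left[\varphi_\lambda(x-U_k)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dx.$$

In the first integral, write $y = x-U_1$. So

$$\mathrm{Var}(G^0(\varphi_\lambda)\,|\,U_1,\dots,U_n) = \mu_c\|\varphi_\lambda\|_2^2+2\mu_c\int_{-1}^{T+1}\varphi_\lambda(x-U_1)\sum_{i=2}^n\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dx$$
$$+\mu_c\int_{-1}^{T+1}\sum_{i=2}^n\sum_{k=2}^n\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]\left[\varphi_\lambda(x-U_k)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dx.$$

Thus,

$$\mathbb{E}\left(\mathrm{Var}(G^0(\varphi_\lambda)\,|\,U_1,\dots,U_n)\right) = \mu_c\|\varphi_\lambda\|_2^2+\mu_c\sum_{i=2}^n\int_{-1}^{T+1}\mathbb{E}\left(\left[\varphi_\lambda(x-U)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]^2\right)dx$$
$$= \mu_c\|\varphi_\lambda\|_2^2+(n-1)\mu_c\int_{-1}^{T+1}\mathrm{Var}_\pi(\varphi_\lambda(x-U))\,dx$$
$$\le\mu_c\|\varphi_\lambda\|_2^2+(n-1)\mu_c\frac{(T+2)\|\varphi_\lambda\|_2^2}{T}$$
$$\le C(\mu_c)n, \qquad (6.7)$$

by using (6.1) and Lemma 6.1 of [34].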

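Now, we deal with the second term of (6.6). We have

$$\mathbb{E}(G^0(\varphi_\lambda)\,|\,U_1,\dots,U_n) = \mu_c\int_{-1}^{T+1}\varphi_\lambda(x-U_1)\,dx+\sum_{i=2}^n\int_{-1}^{T+1}\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]\mu_c\,dx$$
$$= \mu_c\sum_{i=2}^n\int_{-1}^{T+1}\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dx,$$

by using (6.1). Therefore,

$$\mathrm{Var}\left(\mathbb{E}(G^0(\varphi_\lambda)\,|\,U_1,\dots,U_n)\right) = \mu_c^2\,\mathrm{Var}\left(\sum_{i=2}^n\int_{-1}^{T+1}\left[\varphi_\lambda(x-U_i)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dx\right)$$
$$= \mu_c^2(n-1)\mathrm{Var}\left(\int_{-1}^{T+1}\left[\varphi_\lambda(x-U_1)-\mathbb{E}_\pi(\varphi_\lambda(x-U))\right]dx\right)$$
$$\le\mu_c^2(n-1)\mathbb{E}\left[\left(\int_{-1}^{T+1}|\varphi_\lambda(x-U_1)|\,dx\right)^2\right]$$
$$\le C(\mu_c)2^{-j}n. \qquad (6.8)$$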

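Finally, by combining inequalities (6.6), (6.7) and (6.8), we obtain:

$$\mathrm{Var}(G^0(\varphi_\lambda))\le C(\mu_c)n.$$

Thus,

$$\mathrm{Var}(\hat T_\lambda)\le\frac{C}{n^2}\left\{n+\frac{n^2}{T}+\frac{2^{-j}n^3}{T^2}\right\}\le\zeta Q_{j,n,T},$$

with

$$Q_{j,n,T} = \frac{1}{n}+\frac{1}{T}+\frac{2^{-j}n}{T^2}$$

and $\zeta$ a positive constant depending on $\mu_c$, $R_1$ and $R_\infty$.

Taking $x = \sqrt{2\zeta Q_{j,n,T}/\beta}$ in (6.5) and using the previous inequality leads to

$$\mathbb{P}_h\left(|\hat T_\lambda-\mathbb{E}_h(\hat T_\lambda)| > \sqrt{2\zeta Q_{j,n,T}/\beta}\right)\le\frac{\beta}{2}.$$

Therefore, if $\mathbb{E}_h(\hat T_\lambda) > \sqrt{2\zeta Q_{j,n,T}/\beta}+q^\alpha_{1-\beta/2}$, then

$$\mathbb{P}_h(\hat T_\lambda\le q^\alpha_{1-\beta/2}) = \mathbb{P}_h\left(\hat T_\lambda-\mathbb{E}_h(\hat T_\lambda)\le q^\alpha_{1-\beta/2}-\mathbb{E}_h(\hat T_\lambda)\right)$$
$$\le\mathbb{P}_h\left(|\hat T_\lambda-\mathbb{E}_h(\hat T_\lambda)|\ge\mathbb{E}_h(\hat T_\lambda)-q^\alpha_{1-\beta/2}\right)$$
$$\le\mathbb{P}_h\left(|\hat T_\lambda-\mathbb{E}_h(\hat T_\lambda)| > \sqrt{2\zeta Q_{j,n,T}/\beta}\right)\le\frac{\beta}{2}$$

and so

$$\mathbb{P}_h(\Phi_{\lambda,\alpha} = 0)\le\beta,$$

which concludes the proof of Lemma 6.1.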
6.6.2 Proof of Lemma 6.2

We focus first on the control of the conditional quantile $q_\lambda^{[U_1,\dots,U_n;N_{c,tot}]}(\alpha)$. For all $m\in\mathbb{N}^*$, the $(1-\alpha)$-quantile $q_\lambda^{[U_1,\dots,U_n;m]}(\alpha)$ is the smallest real number such that

$$\mathbb{P}\left(\hat T^0_{\lambda,m} > q_\lambda^{[U_1,\dots,U_n;m]}(\alpha)\,\Big|\,U_1,\dots,U_n;N_{c,tot} = m\right)\le\alpha,$$

where $\hat T^0$ is defined by (2.2). Let $m\in\mathbb{N}^*$ be fixed. We write

$$\hat T^0_{\lambda,m} = \frac{1}{n}\left|\sum_{k=1}^m S(\varphi_\lambda)(V_k^0)\right|,$$

where $(V_1^0,\dots,V_m^0)$ is an $m$-sample with uniform distribution on $[-1;T+1]$ and for any $v\in[-1;T+1]$,

$$S(\varphi_\lambda)(v) = \sum_{i=1}^n\left[\varphi_\lambda(v-U_i)-\frac{n-1}{n}\mathbb{E}_\pi(\varphi_\lambda(v-U))\right].$$

Since $\mathbb{E}(\varphi_\lambda(V-U)\,|\,U) = 0$ for independent random variables $U$ and $V$ uniformly distributed on $[0;T]$ and $[-1;T+1]$ respectively, the $S(\varphi_\lambda)(V_k^0)$'s are centered and independent conditionally on $U_1,\dots,U_n$. Then we apply Bernstein's inequality (for instance, see Proposition 2.9 of [26]) to get that for all $\omega > 0$, with probability larger than $1-2e^{-\omega}$,

$$\left|\sum_{k=1}^m S(\varphi_\lambda)(V_k^0)\right|\le\sqrt{2m\,\mathrm{Var}(S(\varphi_\lambda)(V_1^0)\,|\,U_1,\dots,U_n)\,\omega}+\frac{\omega}{3}\sup_{v\in[-1;T+1]}|S(\varphi_\lambda)(v)|.$$

Thus, with probability larger than $1-\alpha$,

$$\hat T^0_{\lambda,m}\le f(U_1,\dots,U_n;m),$$

with

$$f(U_1,\dots,U_n;m) = \frac{1}{n}\left\{\sqrt{2m\ln(2/\alpha)V_S}+\frac{\ln(2/\alpha)}{3}B_S\right\}, \qquad (6.9)$$

where

$$V_S = \mathrm{Var}(S(\varphi_\lambda)(V_1^0)\,|\,U_1,\dots,U_n)\quad\text{and}\quad B_S = \sup_{v\in[-1;T+1]}|S(\varphi_\lambda)(v)|.$$

Therefore we have $q_\lambda^{[U_1,\dots,U_n;m]}(\alpha)\le f(U_1,\dots,U_n;m)$ by definition of the quantile.
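As a sanity check of this use of Bernstein's inequality, the following sketch compares the bound $\sqrt{2mV\ln(2/\alpha)}+\frac{B}{3}\ln(2/\alpha)$ with simulated sums of bounded centered variables. It is only an illustration: the summands are taken uniform on $[-1;1]$ (so $V = 1/3$ and $B = 1$), which is an assumption of the example and not the law of $S(\varphi_\lambda)(V_k^0)$, and the function names are ours.

```python
import math
import random

def bernstein_bound(m, variance, sup_bound, alpha):
    """Two-sided Bernstein bound holding with probability at least 1 - alpha."""
    omega = math.log(2.0 / alpha)          # so that 2 * exp(-omega) = alpha
    return math.sqrt(2.0 * m * variance * omega) + sup_bound * omega / 3.0

def exceedance_rate(m, alpha, repetitions, rng):
    """Fraction of simulated sums of m Uniform[-1, 1] variables exceeding the bound."""
    bound = bernstein_bound(m, variance=1.0 / 3.0, sup_bound=1.0, alpha=alpha)
    exceed = 0
    for _ in range(repetitions):
        total = sum(rng.uniform(-1.0, 1.0) for _ in range(m))
        if abs(total) > bound:
            exceed += 1
    return exceed / repetitions

# The observed exceedance rate should not be larger than alpha.
observed = exceedance_rate(m=50, alpha=0.05, repetitions=2000, rng=random.Random(0))
```

In this toy setting the bound is conservative: the observed rate is typically far below $\alpha$, which is consistent with the quantile being bounded from above by $f$ rather than equal to it.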

Let us now provide a control in probability of $f(U_1,\dots,U_n;m)$. We control first $V_S$.



Ui,...,U n 



V S = Var E ^( V i ~ U i) ~ ( n ~ m*M V l ~ U )) 
\i=i 

n 

( E - ^) - (n - l)E„(<p x (V? - U)j) 

. i=i 

r jp-l-~\_ / Th \ ^ 



< E 



Ux,...,U n 

2 



f T+1 f E Vx(v - Ui)<p x (v - U k ) + (n - 1) 2 E^a(« - £/)) I cfc 



<: 



T + 2 /,._ i \ ^ 
2 

T+~2 



{/•T+l " /-T+l 
/ Y,v\{v-u)dv+ e 
" /w =- 1 i=l J,;= - 1 l^fc<« 



<^a(^ - Ui)<p x (v - U k )dv 
( n _ 1)2 rT+l / ,T

+ T2 / ( / \<p\\(y-u)du] dv 



v=~l \J0 



C f T+1 2~ j n 2 

^r\ n+ Yl / t P\(v-Uj<px(v-U k )dv + —=-}, (6.10) 

with $C$ an absolute positive constant. We have a decomposition of the second term in a sum of degenerate $U$-statistics of order 0, 1 and 2. Indeed

V [ T+1 Vx (v - Ui) Vx (v - U k ) dv = W Q + W 1A + Wi, 2 + W 2 , 

T77 , Jv=-l 



with 



/l +1 
[<px(v - Ui) - E w ((p\(v - U))][<p x (v - U k ) - E^xiv - U))] dv, 

Wi,i= £ [ T+1 tp x (v-Ui)En(<px(v-U))dv, 

rT+l 
Jv=-1 



and 



r/'-l 



Wo = - £ [ T+ \l(<p x (v-U))dv. 

i^.ji.^ Jv=-l 



First we control $W_0$:



i w i< n(n-l)(r + 2) 2 

^C^, (6.11) 

with $C$ an absolute positive constant. Next we deal with the control of $W_{1,1}$ and $W_{1,2}$. We notice that

rT+l 



(• 1 +1 

i=l ^=-1 



A («-t0)d« 



and consequently we have by using Lemma 6.3 of 

n rT+l 



rl + L II 

W 1)1 \ = \W 1 , 2 \^(n-l)J2 / \(pxKv-Ui)dv^^ 

i= i ^=-i 



2~-?Vf 2 

< C^— , (6.12) 



with C an absolute positive constant. 
Now it remains to control $W_2$, with

$$W_2 = \sum_{1\le k<i\le n}g(U_i,U_k),$$

where

$$g(U_i,U_k) = 2\int_{-1}^{T+1}\left[\varphi_\lambda(v-U_i)-\mathbb{E}_\pi(\varphi_\lambda(v-U))\right]\left[\varphi_\lambda(v-U_k)-\mathbb{E}_\pi(\varphi_\lambda(v-U))\right]dv.$$

One can apply Theorem 3.4 of [19] to $W_2$ and $-W_2$. It implies that there exist absolute positive constants $c_1$, $c_2$, $c_3$ and $c_4$ such that with probability larger than $1-2\times 2.77e^{-\omega}$,

$$|W_2|\le c_1C\sqrt\omega+c_2D\omega+c_3B\omega^{3/2}+c_4A\omega^2$$

for all $\omega > 0$, where

• $A = \|g\|_\infty\le 8\|\varphi_\lambda\|_1\|\varphi_\lambda\|_\infty\le 8$;

• C 2 = E(Wf ) and we have 

C 2 = E (9 2 (Ui,U k )) 



^ An(n - 1)E 



T+l x 2 ' 

[^ A - U x ) - E n (<p x (v - U))][<p x (v - U 2 ) - E n ((p\(v - U))] dv 

v=-l 



We denote ^(u,U')~w<g>Tr(f(U,U')) the expectation of f(U,U') where U ~ ir and U' ~ tt are 
independent and f (v) = f(v — U). Hence, 

C 2 

21 



^ 4n(n - 1)E 



(t/,[/')~7r<g)7r 



T+l 



v=-l 



[^(v)-E^ u x (v))] (v)-E^ u x (v))] dv 



< 4n 2 E ([/)J7 ')~ 7r ® 7r 



T+l 



¥#(«)¥#»*-%' 



T+l 



u=-l 



< Cn 2 { E 



(77,£/')~T®T 



Ef/^ 

T+l 
u=-l 



T+l 



(*>)¥>>'(«)* ) ( / 



u=-l 



T+l 



+ 



E 



(£/,[/')~7r<g>7r 



v=-l 
T+l 

u=-l 



with C an absolute positive constant. But, 

fT+l 



E 



\<P%\(v)\<p%'\{v)dv 



T+l 



{U,U')~TT®1T 



E 



(£/,C')~ 7r ® 7r 



u=-l 
T+l 

u=-l 



T+l 



u=-l 



kiT^M' !(«)*) \Wx\\ 



II^aII 2 



and 



E 



(£/,C/')~*"«w 



T+l 



1^1 (t;)!^' 1(f)* 



T+l 



E ir (|^|(t;))E ir (|^|(i;))A;<(r + 2) 



<V=— 1 / JD=-1 

by using Lemma 6.3 of |34j . So, 

C 2 < Cn 2 < 
with C an absolute positive constant; 



T 2 



2-J 2~ 2 ^ 
+ 



j 1 y2 [ ' 
• $D = \sup\left\{\mathbb{E}\left(\sum_{1\le k<i\le n}g(U_i,U_k)a_i(U_i)b_k(U_k)\right) : \mathbb{E}\left(\sum_{i=2}^n a_i^2(U_i)\right)\le 1,\ \mathbb{E}\left(\sum_{k=1}^{n-1}b_k^2(U_k)\right)\le 1\right\}$.

But, with the conditions on the $a_i$'s and the $b_k$'s, we have:
E Yl 9(Pi,U k )ai(Ui)b k (U k )\ 

\l^k<i^n J 

U i) - E -(^ - U))][<px(y - U k ) - E^ x (v - U))] dv ai (Ui)b k (U k ) I 

i=2k=l Jv =- 1 J 
/•T+l / n \ x 

^ 2 J _ 1 E ( v E M u ~ U i) ~ E ^a |p A (« " U k ) - E w (^ («))||6 fc (?7 fc )|J cfo 



T+l 



n-1 



/ ^ ^/(n - l)Var ff (^ A (« - ^)) E ( M u ~ U ^ ~ M ^ ( v ))\\h(U k )\ ) do 



k=l 



i)MIe / T+1 |^ - u k ) - eam v - u))\\b 

\ k=1 J v=-l 



«S 2a n 



,AU k )\dv) 



n-1 



C2j^y x \\ 2 El2\\p x \\ 1 J2\h(U k )\ 

V k=l / 



T 



In — 1 

^4a/^^||^ a || 2 ||^ a || 1 V^ : T 



< 4- 



T 

, n — 1 



by using Lemma 6.1 of |34j . Then, 



with C an absolute positive constant; 
B 2 = sup ^E^u, l/ fc ))J , with 



D^C- 



n 



T 



E(g 2 (u,U k )) 



4E 



^ 4E 



T+l 



[cpx{v -u)- En&xiv - U))][<p x {v - U k ) - E n (ip x (v ~ U))] dv 



v=-l 
T+l 



T+l 



[^(v)-E n (^(v))Y\^(v)-E^v))\ dv I \^(v)-E n (^ x (v))\ dv 

V=— 1 J v= — 1 

T+l 
v=-l
T+l 



[<px(v - u) - E^xiv -U))Y\cp x (v-U k )- E n ((p x (v -U))\dv 



/ [px{v-u)-E^ x {v-U))Ydvy x \\\ 



v=-l 



by using Lemma 6.3 of [33]. Hence, 



B < C 



2^n 



with C an absolute positive constant. 
Finally, we obtain for all $\omega > 0$, with probability larger than $1-2\times 2.77e^{-\omega}$,

' 2-->'/ 2 



|W 2 | 



n 2-% 2"^ 2 n 2~^ 2 JE »,„ 9 | 
+ -t^Vw + — p^-w + w^ 2 + w 2 } , 



(6.13) 



with C an absolute positive constant. 

Thus, by inequalities (6.10), (6.11), (6.12) and (6.13), for all $\omega > 0$, with probability larger than $1-2\times 2.77e^{-\omega}$,



V S s: 



2-J'n 2 2-i/ 2 n 
n+ — — - + 



T 



(6.14) 



Then it remains to compute $B_S$. We recall that



Bs = sup 

«6[-l;T+l] 

1 



i=i 



77- — 1 

V\(v - Ui) E^xiv - U)) 



n 



with B$ = sup 

ue[-l;T+l] 



. Since the Haar basis is considered here, we 



< B s + TfWxh, 



J2[<Px(v-U t )-E n (<p x (v-U))} 
i=i 

can write for any 

= 2 J / 2 ^l(2fe+l)2-0'+ 1 )<a:<(fc+l)2-3 ~ ■'■/,• 2 ' .r i2/.— J)2 " ! J ' 

with A = (j, A;). Thus, 



B S < 2^ 2 + £ 2 ) 



where 



S s = sup 

ve[-l;T+l] 



[-'-fc2-^t)-C/ l ^(2A;+l)2-(J+ 1 ) ^(^k2~3 ^v-U^(2k+l)2-0+ 1 '))] 

i=l 



and 



S 5 = sup 

ue[-l;T+l] 

We observe that 



[ i (2fc+i 



)2-(J+ 1 ) <«-{7 i ^(fc+l)2-i ~~ ^(l(2fc+l)2-0'+ 1 )<^-C/^(A:+ 



l)2-i)] 



i=l 



-Bg ^ sup 

B v ,vei 



YsilBjU^-E^lBjU))} 



1=1 



where for any $v\in\mathbb{R}$, $B_v = [v-(2k+1)2^{-(j+1)};v-k2^{-j}]$. We set $\mathcal{B} = \{B_v, v\in\mathbb{R}\}$ and for every integer $n$, $m_n(\mathcal{B}) = \sup_{|A|=n}|\{A\cap B_v, v\in\mathbb{R}\}|$. It is easy to see that

$$m_n(\mathcal{B})\le 1+\frac{n(n+1)}{2}$$

and so, the VC-dimension of $\mathcal{B}$ defined by $\sup\{n\ge 0, m_n(\mathcal{B}) = 2^n\}$ is bounded by 2 (see Definition 6.2 of [26]). By applying Lemma 6.4 of [26], we obtain:

$$\sqrt n\,\mathbb{E}(\tilde B_S^1)\le\frac{K}{2}\sqrt 2,$$

where $K$ is an absolute constant. So, with a similar argument for $\tilde B_S^2$, we obtain for any $\lambda$ in $\Gamma$

$$\mathbb{E}(\tilde B_S)\le\frac{2^{j/2}}{\sqrt n}K\sqrt 2.$$
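The combinatorial bound on $m_n(\mathcal{B})$ used above can be illustrated by brute force. The sketch below enumerates the traces of all intervals on a finite set of points (a class that contains the sliding windows $B_v$ of fixed length), checks the count $1+n(n+1)/2$, and verifies that the traces of fixed-length windows form a subset of them. The function names and the point configuration are hypothetical choices for the example.

```python
def interval_traces(points):
    """All subsets of `points` of the form points ∩ [a, b]: the empty set and contiguous runs."""
    xs = sorted(set(points))
    traces = {frozenset()}
    for i in range(len(xs)):
        for j in range(i, len(xs)):
            traces.add(frozenset(xs[i:j + 1]))
    return traces

def sliding_window_traces(points, length):
    """Traces of the windows [s, s + length] on `points`, for a finite set of candidate starts s."""
    xs = sorted(set(points))
    if not xs:
        return {frozenset()}
    # The trace can only change when an endpoint of the window crosses a point.
    breakpoints = sorted({x for x in xs} | {x - length for x in xs})
    starts = list(breakpoints)
    starts += [(a + b) / 2.0 for a, b in zip(breakpoints, breakpoints[1:])]
    starts += [breakpoints[0] - 1.0, breakpoints[-1] + 1.0]
    return {frozenset(x for x in xs if s <= x <= s + length) for s in starts}

pts = [0.1, 0.4, 0.45, 0.9]
n = len(pts)
assert len(interval_traces(pts)) == 1 + n * (n + 1) // 2      # 11 traces, fewer than 2^4
assert sliding_window_traces(pts, 0.3) <= interval_traces(pts)
assert len(interval_traces([0.0, 1.0, 2.0])) == 7             # 7 < 2^3: no 3-point set is shattered
```

Since no set of three points is shattered while two points can be, the VC-dimension of the class of intervals is 2, which gives the bound used for $\mathcal{B}$.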
Consequently,

E(B S ) < C { -= + 



/n r J ' 

with $C$ an absolute positive constant and from Markov's inequality, we have that for all $\omega > 0$,

/ ( op/ 2 2-H 2 1 \ 

P f B S > C(u) j _ + _ H < e -. (6.15) 

Thus, by combining inequalities (6.9), (6.14) and (6.15), we obtain for all $\omega > 0$, with probability larger than $1-(2\times 2.77+1)e^{-\omega}$,



m ,...,U« m )*™\ .Lta (2/a) + ^ + ^ + in (2/ a ) + I'" 2 



n V \T T 2 T 3 / 2 / w I v 7 " 2 1 

Furthermore, $N_{[-1;T+1]}\sim\mathcal{P}((T+2)\mu_c+n\|h\|_1)$. Hence,

$$\mathbb{E}(N_{[-1;T+1]})\le C(\mu_c,R_1)(n+T).$$

From Markov's inequality, we have that for all $\omega > 0$,

$$\mathbb{P}\left(N_{[-1;T+1]} > C(\omega,\mu_c,R_1)(n+T)\right)\le e^{-\omega}. \qquad (6.16)$$

Then, we choose $\omega$ such that the quantity $(2\times 2.77+2)e^{-\omega}$ is equal to $\beta/2$. So, with probability larger than $1-\beta/2$,

f(Ui,...,U n ;m) 



C(p,Hc,Ri) \ / - . . . /, _ N /n 2-in 2 2~i/ 2 n\ , , J 2^ 2 2^/ 2 



C{B,ii ai R x ) I a 7 . , / n 2 2-in 3 , . . . / 2^ 2 2^'/ 2 



<C(^ C ,^ V« + _ + j+ln(2/a) h^ + ^r 



Therefore, by definition of $q^\alpha_{1-\beta/2}$,

$$q^\alpha_{1-\beta/2}\le C(\beta,\mu_c,R_1)\left\{\sqrt{\ln(2/\alpha)}\left(\frac{1}{\sqrt n}+\frac{1}{\sqrt T}+\frac{2^{-j/2}\sqrt n}{T}\right)+\ln(2/\alpha)\left(\frac{2^{j/2}}{n^{3/2}}+\frac{2^{-j/2}}{nT}\right)\right\},$$

which concludes the proof of Lemma 6.2.